# Advances in formal Slavic linguistics 2016

Edited by

Denisa Lenertová, Roland Meyer, Radek Šimík, Luka Szucsich

Open Slavic Linguistics 1

### Open Slavic Linguistics

Editors: Berit Gehrke, Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich

In this series:

1. Lenertová, Denisa, Roland Meyer, Radek Šimík & Luka Szucsich (Eds.). Advances in formal Slavic linguistics 2016.


Lenertová, Denisa, Roland Meyer, Radek Šimík & Luka Szucsich (eds.). 2018. *Advances in formal Slavic linguistics 2016* (Open Slavic Linguistics 1). Berlin: Language Science Press.

This title can be downloaded at: http://langsci-press.org/catalog/book/189

© 2018, the authors

Published under the Creative Commons Attribution 4.0 Licence (CC BY 4.0): http://creativecommons.org/licenses/by/4.0/

ISBN: 978-3-96110-127-6 (Digital), 978-3-96110-128-3 (Hardcover), 978-3-96110-140-5 (Softcover)

DOI:10.5281/zenodo.2546440

Source code available from www.github.com/langsci/189

Collaborative reading: paperhive.org/documents/remote?type=langsci&id=189

Cover and concept of design: Ulrike Harbort

Typesetting: Radek Šimík, Roland Meyer, Andrei Koniaev, Sebastian Nordhoff, and authors (alphabetically): Julia Bacskai-Atkari, Olga Borik, Mojmír Dočekal, Anja Gattnar, Berit Gehrke, Matías Guzmán Naranjo, Johanna Heininger, Robin Hörnig, Ivona Kučerová, Franc Marušič, Petra Mišmaš, Olav Mueller-Reichau, Gereon Müller, Andrew Murphy, Vesna Plesničar, Zorice Puškar, Tina Šuligoj, Maria D. Vasilyeva, Marcin Wągiel, Karolina Zuchewicz

Proofreading: Denisa Lenertová, Roland Meyer, Radek Šimík, Luka Szucsich, Jake Walsh

Fonts: Linux Libertine, Libertinus Math, Arimo, DejaVu Sans Mono

Typesetting software: XeLaTeX

Language Science Press Unter den Linden 6 10099 Berlin, Germany langsci-press.org

Storage and cataloguing done by FU Berlin

## **Contents**




## **Preface**

The present volume, *Advances in Formal Slavic Linguistics 2016*, marks a delectable double premiere: It initiates both the book series *Open Slavic Linguistics* as a whole, and its sub-series of collective volumes on formal Slavic linguistics.

*Open Slavic Linguistics* aims at publishing high quality books with a focus on Slavic languages on the empirical side, which at the same time reflect the state of the art and current developments in general linguistics. Its core principles are strict adherence to a genuine Open Access policy and to quality control through double-blind peer review. The series takes a broad linguistic perspective and invites monographs and topical collective volumes from virtually all subdisciplines. This may include theoretically oriented work on Slavic linguistic phenomena, advanced empirical/experimental work on Slavic languages, as well as handbooks, introductions and companions to the linguistic analysis of a given language. The defining characteristic of the series is that it seeks a solid grounding in up-to-date theoretical and empirical methods, fosters mutual understanding of linguists across object languages and subdisciplines, and seeks to contribute both to narrowly defined Slavic linguistics and to general linguistics and linguistic typology.

*Advances in Formal Slavic Linguistics 2016* presents a selection of high quality papers authored by young and senior linguists from around the world and contains both empirically oriented work, underpinned by up-to-date experimental methods, and more theoretically based contributions. The volume covers all major linguistic areas, including morphosyntax, semantics, pragmatics, phonology, and their mutual interfaces. The particular topics discussed range from argument structure, word order, case, agreement, tense, aspect, and the left clausal periphery to segmental phonology. The thematic breadth and analytical depth of the contributions reflect the vitality of the field of formal Slavic linguistics and testify to its relevance for the global linguistic endeavor.

Early versions of the papers included in this volume were presented at the conference on Formal Description of Slavic Languages 12 or at the satellite Workshop on Formal and Experimental Semantics and Pragmatics, which were held in Berlin on 7–10 December 2016 – the year referred to in the title of the volume. Half of the submitted abstracts made it into the 44 presentations of the
conference. The 21 papers in the present volume were developed from these contributions in the course of a further thorough reviewing process. Neither the original conference nor the present volume would have been possible without the readiness of so many experts to devote their time and thoughts to the critical evaluation and helpful commenting of their colleagues' research papers. We wish to express our gratitude both to the 75 anonymous reviewers of the original conference abstracts, and to the more than 50 external reviewers for the present volume. Their commitment testifies to the liveliness and ambition of the field of Slavic linguistics. This book would have also been impossible without our student assistants, Bella Badt, Justina Bojarski, Andrei Koniaev and Jake Walsh, and the invaluable help of the Language Science Press editors Sebastian Nordhoff and Felix Kopecky. We gratefully acknowledge their efforts and support. Finally, we would like to acknowledge the authors themselves. Open Access publishing is a collective endeavor and we appreciate the authors' willingness to collaborate with us closely not just on linguistic and scientific issues, but also on editorial matters. We sincerely hope that the authors and readers of this volume will share our conviction that it has been worthwhile.

> Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich
>
> Berlin, 14 December 2018

### **Chapter 1**

## **Doubly filled COMP in Czech and Slovenian interrogatives**

### Julia Bacskai-Atkari

University of Potsdam

This article investigates the syntax of doubly filled COMP patterns in Czech and Slovenian interrogatives from a cross-linguistic perspective, concentrating on the differences between Germanic and Slavic doubly filled COMP. In Germanic, dialects that allow the doubly filled COMP pattern do so to lexicalize a C head specified as [fin] with overt material, which is regularly carried out by verb movement in main clauses (e.g. V2 in German, T-to-C in English interrogatives) and by the interrogative complementizer in embedded polar questions. The insertion of the complementizer has no interpretive effect on the clause and is restricted to embedded clauses. By contrast, in Czech and Slovenian a complementizer can be inserted even in main clauses, and while its presence is optional, its insertion triggers an interpretive difference, resulting in an echo reading. I argue that while in Germanic, the C head is specified as [wh] and is checked off by the wh-element, in Slavic the C is not specified as [wh] and the type of the clause hence matches the properties of the inserted declarative head. In turn, the wh-element moves because it is focused: echo questions are closer to focus constructions than to ordinary questions.

**Keywords:** complementizer, doubly filled COMP, echo questions, finiteness, interrogative clause, wh-movement

### **1 Introduction**

Doubly filled COMP patterns and especially their absence from the standard varieties are well known in the literature on West-Germanic languages.<sup>1</sup> In order to illustrate the phenomenon, consider first the following interrogatives from Standard English:

<sup>1</sup>The West-Germanic languages to be discussed here include English, German, and Dutch. Note that there have been claims in the literature, notably by Emonds & Faarlund (2014) that English is not a West-Germanic but a North-Germanic language. However, as shown convincingly by Bech & Walkden (2016), this claim has serious problems and it cannot be maintained.

Julia Bacskai-Atkari. 2018. Doubly filled COMP in Czech and Slovenian interrogatives. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 1–23. Berlin: Language Science Press. DOI:10.5281/zenodo.2545509

	- b. **Did** she buy a book?
	- c. I don't know **which book (\*that)** she bought.
	- d. I don't know **if** she bought a book.

The ban on the insertion of *that* in (1c) is traditionally referred to as the "doubly filled COMP filter", which is supposed to prohibit lexical material in both the specifier and the head of the same XP projection (Chomsky & Lasnik 1977: 446, see also Koopman 2000). Hence, the wh-element *which book* cannot co-occur with the complementizer *that* in embedded constituent questions. The same issue does not arise in embedded polar questions containing *if*, since the interrogative marker is the complementizer in these cases: the impossibility of the sequence *if that* follows from the two elements being in complementary distribution and need not be accounted for by an additional filter rule.

One problem that arises with the doubly filled COMP filter as a general rule is that it is not obeyed in main clause constituent questions. As can be seen in (1a) and (1b), the verb moves up to C in main clause questions in English (and more generally in Germanic), and this results in the co-occurrence of an overt wh-element in SpecCP with the verb in C in main clause constituent questions, see (1a). While one could in principle argue that main clause questions with verb movement are subject to different requirements, another problem arises in connection with various non-standard dialects (as indicated by van Gelderen 2009, Bayer 2004 and Bayer & Brandner 2008, such dialects are found across West Germanic without a very clear geographical restriction), which show clear violations of the doubly filled COMP filter (cf. the data in Baltin 2010):

(2) I don't know **which book that** she bought.

As can be seen, the co-occurrence of the wh-phrase and *that* is allowed in the non-standard pattern; this is attested across Germanic. This obviously raises the question why doubly filled COMP patterns arise in Germanic and, if applicable, cross-linguistically.

In this article, I propose the following. First, doubly filled COMP patterns in Germanic arise when a finite complementizer is inserted in addition to a wh-element in SpecCP and the complementizer serves to lexicalize [fin] in C. In principle, lexicalization can be carried out by other elements, too (such as verbs in main clauses), and the insertion of *that* causes no interpretive differences compared to *that*-less interrogatives. I argue that the lexicalization requirement on [fin] is more generally attested in the syntactic paradigm and is related to V2 and to T-to-C movement. Second, there is no such lexicalization requirement in Slavic languages and the insertion of a complementizer causes an interpretive difference (namely, the clause is interpreted as an echo). I argue that this difference is related to syntactic features as well: while wh-movement in Germanic doubly filled COMP structures is driven by a [wh] feature on the C head, there is no such feature on C in Slavic doubly filled COMP structures.

### **2 Doubly filled COMP in Germanic**

I adopt the general idea of Bacskai-Atkari (2018a), according to which a C with [fin] specification is regularly lexicalized in Germanic, with some inter-language variation. English is somewhat exceptional as it is not a V2 language: the lexicalization rule applies to interrogatives and is manifest in the phenomenon of T-to-C movement. In German, it applies to declaratives as well and results in the matrix V2 configurations. Consider the following matrix interrogatives in English:

	- b. **Did** she buy a book?

The corresponding structures are shown in (4) below:

In either case, the C head is lexicalized by way of the verb moving up to C via head adjunction, and the SpecCP position is filled by an operator element. Note that there is a distinction between [wh] and [Q], following the idea of Bayer (2004), whereby [Q] essentially stands for disjunction; wh-elements are [Q] but not all elements with a [Q] specification are [wh] (see Bacskai-Atkari 2018a for [Q] in Germanic). Further, the operator in (4b) is a covert polar operator. The polar operator can in principle be overt (e.g. English *whether*) or covert, and it
marks the scope of a covert *or* (Larson 1985). This operator is inserted directly into SpecCP (Bianchi & Cruschina 2016).

Consider now the following English embedded interrogatives:

	- b. I don't know **if** she bought a book.

The corresponding structures are shown in (6):<sup>2</sup>

The interrogative feature has to be marked overtly in embedded questions (there being no distinctive interrogative intonation) and it is done either by an overt complementizer or by an overt operator. Accordingly, the interrogative feature on C can be checked off by inserting an element into C (*if*) or by inserting an element into the specifier (*which book* in (6a) above). By contrast, [fin] can be lexicalized only by an element inserted into C (*that* and *if* in (6) above, but not by e.g. *which book* in the specifier).

<sup>2</sup>Contrary to Baltin (2010), I assume that doubly filled COMP structures are literally doubly filled COMP, that is, there is only a single CP involved; see Bacskai-Atkari (2018b) for arguments on this. Essentially, Baltin (2010) assumes that the ban on overt material in C in sluiced clauses (Merchant 2001) follows directly from the fact that the ellipsis position is located in the highest C head, eliding the complementizer in a lower C position. However, this is in fact not a sound argument since the lack of a complementizer in these cases can be due to phonological factors as well (the complementizer cliticising onto the clause in the languages he examined), which may indeed be subject to cross-linguistic variation. In Slovenian, for instance, wh-sluices can contain a complementizer (e.g. *da* 'that' but apparently also *če* 'if'), see Marušič et al. (2015), indicating that the generalization does not hold. Note that the Slovenian data contradict the judgements given by Merchant (2001: 76), who suggests that while doubly filled COMP patterns are possible in Slovenian in the same way they are attested in other languages (see, for instance, the Danish and Irish data given by Merchant 2001: 76–77), the sluiced version of doubly filled COMP clauses (containing an overt complementizer) is uniformly rejected.

Regarding the lexicalization of [fin] in C, the following can be established. In matrix clauses, as shown in (4), [fin] in C is lexicalized via verb movement,
whereby the verb adjoins to C (head adjunction). In embedded clauses, a complementizer is inserted:<sup>3</sup> there are two possible ways here. One is to insert an interrogative complementizer, see (6b), which also checks off the [Q] feature. Further, the insertion of the regular finite subordinator is possible if [wh] is checked off by an overt operator, hence in structures like (6a): this option can be observed in nonstandard varieties. Since, as the structures above demonstrated, lexicalization of [fin] in C is generally attested in the syntactic paradigm, standard varieties in West Germanic have an exception in (6a) by not lexicalising the C head,<sup>4</sup> while nonstandard varieties are completely regular in this respect. Note that the insertion of an interrogative complementizer is not a viable option in cases like (6a) since the insertion of the complementizer would check off the active interrogative feature on the C head,<sup>5</sup> and hence there would be no feature attracting the wh-element to move to the CP (since [Q] is a subset of [wh], an interrogative complementizer would not be incompatible with the feature specification of the head) and thus prevent the movement of the wh-element.

The insertion of the complementizer is thus in line with the general V2 property of Germanic languages and with T-to-C movement in English interrogatives. Further, the insertion of the finite complementizer causes no interpretive difference, and several dialects show optionality with respect to the insertion of the complementizer.<sup>6</sup>
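The two requirements discussed in this section can be summarized in a small toy model. This is only an illustrative sketch, not part of the analysis itself: the class `CP`, the sets `INTERROGATIVE_HEADS` and `OVERT_OPERATORS`, and the two helper functions are hypothetical names introduced here, and the model covers only the English embedded patterns in (6).

```python
from dataclasses import dataclass
from typing import Optional

# Toy model of a single CP (illustrative assumption, not the author's formalism).
@dataclass
class CP:
    spec: Optional[str]  # overt element in SpecCP, e.g. "which book", or None
    head: Optional[str]  # overt element in C, e.g. "that", "if", or None

# Hypothetical mini-lexicon for the English examples in (6).
INTERROGATIVE_HEADS = {"if"}                  # complementizers carrying [Q]
OVERT_OPERATORS = {"which book", "whether"}   # overt operators in SpecCP

def fin_lexicalized(cp: CP) -> bool:
    """[fin] counts as lexicalized only if C itself hosts overt material."""
    return cp.head is not None

def interrogative_marked(cp: CP) -> bool:
    """The interrogative feature is marked by an element in C or in SpecCP."""
    return cp.head in INTERROGATIVE_HEADS or cp.spec in OVERT_OPERATORS

standard = CP(spec="which book", head=None)       # (6a), standard variety
nonstandard = CP(spec="which book", head="that")  # (6a), non-standard variety
polar = CP(spec=None, head="if")                  # (6b)

for name, cp in [("standard", standard), ("nonstandard", nonstandard), ("polar", polar)]:
    print(name, fin_lexicalized(cp), interrogative_marked(cp))
```

In this sketch, all three clauses are overtly marked as interrogative, but only the standard version of (6a) leaves [fin] unlexicalized, which corresponds to the exceptional status of the standard varieties described above.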

<sup>3</sup>While [fin] is lexicalized by verb movement in main clauses, this is generally not possible in embedded clauses: certain verbs in German allow embedded V2 and there are certain dependent clauses (such as hypothetical comparatives and conditionals) that likewise allow verb fronting. As argued by Bacskai-Atkari (2018a), this is due to restrictions from the matrix predicate.

<sup>4</sup>According to Bacskai-Atkari (2018a), this has to do with licensing conditions on zero complementizers (i.e., they are licensed in these environments in the standard language). In addition, the "doubly filled COMP filter" is rather the consequence of an economy principle against multiple elements with overlapping functions, which interacts with a principle favouring overt marking, see van Gelderen (2009). This question cannot be examined here in detail.

<sup>5</sup>The C head is specified as [wh] and the complementizer has the feature [Q]. The two features are not fully incompatible, though, as [Q] is a subset of [wh] (cf. Bayer 2004). The problem with inserting the complementizer is the deactivation of the feature, as described above, not feature incompatibility.

<sup>6</sup>Optionality arises in certain dialects with head-sized wh-phrases that may be inserted into either the specifier or the head, see Bacskai-Atkari (2018b), following Bayer & Brandner (2008). Not all dialects have optionality, though. As there is no interpretive difference between configurations with and without the complementizer, it is actually expected that at least some dialects show optionality; note that while optionality is considered to be problematic for minimalist approaches, dialect data and diachronic data in fact support the view that at least some optionality is allowed in language, to allow gradual variation and change. These issues cannot be pursued here in detail.

Doubling is possible in polar interrogatives as well if the operator is overt. In English, the operator *whether* can appear in embedded clauses overtly and doubling with *that* can be observed both historically and synchronically (see van Gelderen 2009 for modern substandard varieties); in main clauses, its appearance is restricted to historical examples.<sup>7</sup> Consider:

	- b. I wot not **whether that** I may come with him or not. 'I do not know whether I may come with him or not.'

(*Paston Letters* XXXI)

As can be seen, *whether* is similar to ordinary wh-operators in triggering verb movement to C in main clauses and in allowing the insertion of *that* in embedded clauses; hence, its behaviour contrasts with that of *if*. Importantly, just like in constituent questions, there is no interpretive difference between the version with *that* and the version without *that* of the same sentence.

Regarding the separation of [wh] and [Q] mentioned above, it must be mentioned that the co-occurrence of two interrogative elements is possible in certain languages (Bayer 2004). This can be observed in Dutch dialects in examples like (8) below:

(8) Ze weet **wie** **of** **dat** hij had willen opbellen
    she knows who if that he had want call
    'She knows who he wanted to call.'

(Bayer 2004: 66, ex. 17, citing Hoekstra 1993)

As can be seen, in this case three overt elements appear in the CP-domain: the wh-operator itself, the Q-element *of* 'if' and the finite complementizer *dat* 'that'. Again, no interpretive difference can be attributed to the insertion of multiple elements: clauses with the combination *wie dat* 'who that' and clauses with a single *wie* 'who' have the same interpretation, too. The structure for the CP-domain in (8) is shown below:

<sup>7</sup>As mentioned above, verb movement to C in embedded clauses is subject to restrictions (due to the matrix predicate).

The polar operator is in the scope of a wh-operator, and the clause is ultimately specified as [wh]: hence, even if the Q-element *of* is inserted into the lowest SpecCP, [wh] is not checked off and the CP projects further (essentially, the [wh] feature of the lower C is inherited by the higher C).

To conclude this section, it can be established that doubly filled COMP patterns in Germanic interrogatives follow from a requirement on lexicalising [fin] on C, which ultimately follows from the V2 property of Germanic languages, whereby English is slightly exceptional in that V2 is no longer attested, but the same applies to T-to-C movement in interrogatives. The expectation is therefore that genuine doubly filled COMP patterns should be different or not available in languages where there is no lexicalization requirement on [fin] in main clause interrogatives.<sup>8</sup>

### **3 Czech**

In this section, I give an overview of the possible patterns in Czech main and embedded questions. I will show that doubling is possible, yet while the resulting combinations are in part surface-similar to their Germanic counterparts, they are associated with a particular (echo) interpretation.

<sup>8</sup>Note that while V2 (or T-to-C) is probably necessary for genuine doubly filled COMP, it is not true the other way round: it is indeed possible that the lexicalization of [fin] does not hold in all constructions and a language may be V2 without showing doubly filled COMP effects: for instance, Standard German (and any variety of German lacking doubly filled COMP patterns) is such a language.

Just like in English, constituent questions in Czech contain an overt wh-element fronted to the left edge of the clause:<sup>9</sup>

	- b. Ptala se, **kdo** přijel.
	  asked.3sg.f refl who arrived.3sg
	  'She asked who arrived.'

I assume that the wh-element moves to SpecCP, following Rudin (1988) and Kaspar (2015).

Regarding doubly filled COMP patterns, the insertion of *že* 'that' is possible. However, this results in an interpretive difference from ordinary questions and essentially renders echo questions where the speaker asks for the value of the wh-element<sup>10</sup> (see Kaspar 2015, Gruet-Skrabalova 2011):

	- b. ? Ptala se, **kdo** **že** přijel.
	  asked.sg.f refl who that arrived.sg.m
	  'She asked who was said to have arrived.'

<sup>9</sup>Note that I am only considering questions involving a single wh-phrase in this paper and do not venture to examine multiple wh-fronting. As argued by Bošković (2012), multiple wh-questions actually involve the movement of a single wh-phrase due to a [wh] feature, and the remaining wh-elements are either located in situ or are fronted as focused phrases: crucially, the CP does not contain multiple [wh] features attracting various wh-elements. See also Gruet-Skrabalova (2011) on Czech and Mišmaš (2016) on Slovenian. In this sense, further wh-phrases and their position in the clause are not relevant to the present discussion, which is centred on clause-typing issues.

<sup>10</sup>As Jiri Kaspar (p.c.) informs me, constituent questions with *že* can be interpreted as canonical echo questions (where the value of the wh-element was inaudible), reminder questions (the speaker has forgotten the value), verification questions (the speaker is unsure about the value), and surprise questions (the speaker assumes a different value). Since all these types have been subsumed under the umbrella term "echo questions" in the literature, as opposed to ordinary questions, I will simply use the label "echo questions" in this paper but it should be kept in mind that this term subsumes various subtypes (this applies to the Slovenian data, too).

The sentence in (11a) is an appropriate reaction to a statement such as 'Peter arrived'. The sentence in (11b) is the embedded version thereof; its markedness stems from the fact that it is relatively difficult to find contexts in which an embedded echo is felicitous. As far as the status of *že* is concerned, I follow Kaspar (2015) in assuming that this element is located in C;<sup>11</sup> hence, its co-occurrence with the wh-element in SpecCP makes the doubly filled COMP effect possible.

Consider now the following polar questions:

	- b. Ptala se, **jestli** Marie přijela.
	  asked.sg.f refl if Mary arrived.sg.f
	  'She asked if Mary arrived.'

As can be seen, the embedded polar question in (12b) is introduced by *jestli* 'if', while its matrix interrogative counterpart in (12a) has no morphophonological marker.

The insertion of *že* 'that' into clauses with *jestli* is impossible:

(13) \* Ptala se, **jestli** **že** Marie přijela.
    asked.sg.f refl if that Mary arrived.sg.f
    'She asked if Mary arrived.'

The elements *že* and *jestli* are in complementary distribution regarding their syntactic position (but not their function<sup>12</sup>); hence, since *že* is in C, it can be concluded that *jestli* is in C, too. This is in line with the etymology of *jestli*, a grammaticalized form of the question particle *li* and the verb 'be': in Czech, if C is filled by the clitic -*li*, the verb moves up to C to host the clitic (Schwabe 2004).

In addition to the constructions so far, it should be mentioned that wh-elements may appear in polar questions headed by *jestli*, rendering an echo reading:

	- b. \* Ptala se, **kdo** **jestli** přijel.
	  asked.sg.f refl who if arrived.sg.m
	  'She asked about whom the question arose whether they arrived.'

<sup>11</sup>As Kaspar (2015) shows, there is in fact more than one *že* element in Czech, see also Gruet-Skrabalova (2012); I will only concentrate on the declarative complementizer appearing in the clauses under scrutiny.

<sup>12</sup>This means that while they occupy the same position, C, in syntax, they do not have the same distribution and *že* cannot introduce questions by itself:

(i) \* Ptala se, **že** Marie přijela.
    asked.sg.f refl if Mary arrived.sg.f
    'She asked if Mary arrived.'

The sentence in (14a) is an appropriate reaction to a question such as 'Did Peter arrive?', and hence is an echo of a polar question.<sup>13</sup> As can be expected, the insertion of *že* 'that' is again impossible:<sup>14</sup>

	- b. \* Ptala se, **kdo** **jestli** **že** přijel.
	  asked.sg.f refl who if that arrived.sg.m
	  'She asked about whom the question arose whether they arrived.'

Regarding the interrogative patterns in Czech, the following points can be established. First, doubly filled COMP effects are possible with *že* 'that' and with *jestli* 'if': both render echo questions (though these echo questions are licensed in two different kinds of context) and the elements *že* and *jestli* cannot occur together. Second, the insertion of the complementizer (in addition to the element in the specifier) is not attested in ordinary constituent questions. Third, the insertion of either complementizer (in addition to the wh-element) triggers an echo interpretation. Fourth, the complementizer is available in main clause echo questions, contrary to ordinary main clause questions, and in this way the echoed statement/question is surface-similar to an embedded clause, in line with the fact that it is dependent on a particular context in order to be felicitous.<sup>15</sup> This is contrary to what was seen in Germanic, where no echo interpretation is attested and where complementizers are not inserted in main clause constituent questions. Fifth, the patterns in Czech suggest that the clause type reflects the properties of the complementizer, not those of the wh-element (see the discussion in §5); this is again contrary to Germanic, where the presence of a wh-element indicates that the clause is a true interrogative.

<sup>13</sup>The impossibility of embedding such an echo, as in (14b), may well have pragmatic reasons, i.e. such a sentence is not felicitous in any context. Note that if the Czech pattern were an ordinary doubly filled COMP pattern, such as in (substandard) West Germanic, then (14b) should be grammatical and (14a) should be ruled out.

<sup>14</sup>Note that the impossibility of the combinations discussed in this paper is not merely due to their relative order: changing their relative order (e.g. *že jestli*) results in an ungrammatical configuration, too.

<sup>15</sup>Note that there are other instances of subordinating C-elements appearing in main clauses, as is the case for German *ob* 'if' in V-final main clause questions that are pragmatically distinct from ordinary questions, see e.g. Zimmermann (2013). Naturally, the discussion of this issue would go far beyond the scope of the present paper.

### **4 Slovenian**

This section gives an overview of the possible patterns in Slovenian main and embedded questions. I will show that doubling is possible in similar ways to what was attested in Czech; again, the resulting combinations are in part surface-similar to their Germanic counterparts, yet they are associated with a particular (echo) interpretation.

Just like in English and Czech, constituent questions in Slovenian contain an overt wh-element fronted to the left edge of the clause:


I follow Golden (1997) and Hladnik (2010) in assuming that the wh-element moves to SpecCP.

Just like in Czech, the insertion of *da* 'that' is possible; this renders echo questions (see Hladnik 2010):<sup>17</sup>

(i) Deček, boy katerega that sem aux.1sg srečal met včeraj, yesterday me me je aux.3sg prepoznal. recognized 'The boy that I met yesterday, recognized me.' (Marušič 2008b: 266)

<sup>16</sup>As noted, the data are essentially taken from Hladnik (2010); however, the translations have been changed in accordance with what my informants gave as more natural translations.

<sup>17</sup>Just like in the Czech examples, the verb immediately follows the wh-element; however, this is not an effect of V2 in either language. In Slovenian, certain clitics, including auxiliaries, appear in a second position, as in (i):

As can be seen, the clitic *je* follows the element *me*, and is hence the second element in the clause. However, as shown by (ii), it appears that *je* can follow both *kdo* and *da* in doubly filled COMP patterns:

Julia Bacskai-Atkari

(17) a. **Kdo** who **da** that pride? comes

'WHO is coming?' (Hladnik 2010: 13, ex. 9)

b. ? Vprašal asked.sg.m je, aux.3sg **kdo** who **da** that pride. comes 'He asked who was said to be coming.'

(based on Hladnik 2010: 14, ex. 11)

The sentence in (17a) is an appropriate reaction to a statement such as 'Peter is coming'; the sentence in (17b) shows the embedded version and is marked for pragmatic reasons, just as was the case for its Czech counterpart. Regarding the status of *da*, I follow Hladnik (2010) in assuming that it is located in C; hence, when appearing together with a wh-element, (surface) doubly filled COMP effects are possible.<sup>18</sup>

Since *je* appears after the elements *kdo* and *da*, one might wonder whether *kdo da* is a constituent or whether *kdo* is in a higher clause. However, both options are unlikely: an element in the specifier cannot form a constituent with the C head, and postulating a higher clause to locate a single element would be highly problematic, too. I assume that *kdo* is in SpecCP and *da* in the C head of the same CP, whereby the two elements neither form a constituent nor are they located in different clauses. There is in fact no need to assume a strict surface secondposition requirement on Slovenian clitics. As shown by Marušič (2008b), analyses assuming a fixed syntactic position such as C for clitics, as by Golden & Sheppard (2000), face a number of problems and the relative position of the clitic should rather be considered phonological in nature (in line with general "Wackernagel" phenomena). In this case, the clitic naturally follows the element in the C head even if the specifier of the CP is filled by some additional element since there is no way of inserting the clitic in between the element in the specifier and the element in the head of a single CP projection. If the wh-element and the complementizer were located in separate projections, one might expect the clitic to intrude, which is not the case. Note that, strictly speaking, the same holds even if one assumes a fixed syntactic position for the clitic (a projection below CP or another CP, resulting in a split CP) since the filling of the specifier in a higher projection does not influence the realization of the clitic in some lower projection.

<sup>18</sup>Again, one might wonder whether the wh-element is indeed in the same CP as the complementizer *da*. In Slovenian, a null complementizer is licensed only if the wh-element is in the relevant specifier: it is not possible if the wh-phrase undergoes long distance movement, and in these cases *da* is inserted, see Golden (1997), Marušič (2008a). Hence, one might think that the doubly filled COMP effect in echo questions arises merely because the complementizer has to be overt if the wh-element is in a higher clause. However, as shown by Mišmaš (to appear), echo questions in Slovenian are in fact possible even without *da*, which indicates that the wh-element does not move out of the clause where it is base-generated.

(ii) Kdo da je prišel?
who that aux.3sg come.sg.m
'WHO came?'

### 1 Doubly filled COMP in Czech and Slovenian interrogatives

Consider now the following polar questions:<sup>19</sup>


As can be seen, a question particle – *a* or *če* – is licensed both in main clause and in embedded interrogatives. The insertion of *da* 'that' is possible in both cases and it renders an echo reading (cf. Hladnik 2010):


The sentence in (19a) is an appropriate reaction to a statement such as 'He is coming'. Importantly, *da* and *a*/*če* are not in complementary distribution, which suggests that *a*/*če* are not in C, contrary to Czech *jestli*. Instead, in the given constructions they are rather operators located in SpecCP, similarly to English *whether*.<sup>20</sup>

Finally, it must be mentioned that wh-elements may appear in polar questions; this renders an echo interpretation, similarly to what was observed in Czech. Note that the acceptability of these constructions in Slovenian is dependent on the dialect/idiolect, as also indicated by Hladnik (2010) in connection with all the doubling patterns, and this seems to be especially true in the case of the triple combination in (20b) below (this was not accepted as grammatical by my main informant).<sup>21</sup> Consider the following examples:

<sup>19</sup>Again, I cannot examine the distribution of *a* and *če* beyond the constructions under scrutiny and will discuss only the differences within the given syntactic paradigm.

<sup>20</sup>Note that *če* can appear in conditional clauses, too; however, the discussion of this falls outside the scope of the present paper.

### Julia Bacskai-Atkari


The sentence in (20a) is an appropriate reaction to a question such as 'Is Peter coming?', and the sentence in (20b) is an appropriate reaction to a question such as 'Is it true that Peter is coming?'. Crucially, in both sentences in (20), the Q-element is *če* and not *a*, as opposed to ordinary main clause interrogatives.<sup>22</sup> This indicates that the difference from ordinary questions is encoded morphosyntactically, too.<sup>23</sup>

Regarding the interrogative patterns in Slovenian, the following points can be established. First, doubly filled COMP effects are possible with *da* 'that' and *a*/*če* 'if'. Second, the complementizer (in addition to the element in the specifier) is not inserted in ordinary constituent questions and may be inserted in ordinary polar questions. Third, the insertion of either complementizer (in addition to the wh-element or the Q particle in the specifier) triggers an echo interpretation. Unlike Czech, the echo of a question (a "double echo" in Hladnik 2010) is possible in Slovenian (at least dialectally, see (20) above). Fourth, the complementizer is available in main clause echo questions, contrary to ordinary main clause questions, and in this way the echoed statement/question is surface-similar to an embedded clause, in line with the fact that it is dependent on a particular context in order to be felicitous. This is similar to Czech and contrary to what was seen in Germanic, where no echo interpretation is attested and where complementizers are not inserted in main clause constituent questions. Fifth, the patterns in Slovenian, just like in Czech, suggest that the clause type reflects the properties of the complementizer, not those of the wh-element (see §5); this is again contrary to Germanic, where the presence of a wh-element indicates that the clause is a true interrogative.

<sup>21</sup>Unfortunately, since the focus of Hladnik (2010) is relative clauses, the exact geographical distribution of the interrogative patterns cannot be recovered from his thesis, and it remains unclear whether the acceptability of (20) shows relatively clear regional differences or whether the differences hold rather between idiolects. As Hladnik (2010: 6–8) describes in the introduction, he conducted a larger pilot study of Slovenian dialects, whereby the focus was on syntactic doubling and on variation in dialects. Altogether, over 70 responses were collected from 55 test locations; further, since Slovenian speakers acquire a regional dialect as a rule, the data are quite reliable in that they reflect regional varieties rather than the standard language.

<sup>22</sup>As one of the reviewers informs me, this is true also if the clause is sluiced: the element *kdo* can be followed by *če* but not by *a*. This is expected if sluiced clauses are derived from regular interrogatives. Note also that in cases like (20a), the wh-element may remain in situ, in line with the assumption that the movement involved here is not genuine wh-movement but rather focusing (which preferably involves fronting); see the discussion in §5.

<sup>23</sup>As was noted before, certain contexts license clauses that are surface-similar to ordinary embedded clauses, such as matrix questions with *ob* 'if' in German. The pattern in (20) again indicates that the particular echo constructions are discourse-dependent and cannot appear in the same environments as ordinary main clause questions.

Table 1: Clause typing and Germanic doubly filled COMP

### **5 The analysis**

The present paper investigates various patterns involving wh-elements, Q elements and finite subordinators in Germanic and in Slavic languages. In this section, I will first give an overview of how these combinations behave.

The combinations observed in Germanic are given in Table 1; these combinations are attested in embedded clauses only.

As can be seen, the type of the clause always matches the leftmost element in the linear sequences. That is, once a wh-element is inserted, the clause can only be a constituent question. If there is no wh-element but a Q element is present, the clause can only be a polar interrogative. Naturally, a clause is always typed by the C head but certain features on the C head are checked off by elements moving to the specifier, as in wh-questions (yet the wh-elements do not themselves type the clause).



Table 2: Clause typing and Slavic doubly filled COMP

The combinations observed in Slavic (Czech and Slovenian) are given in Table 2; these combinations are attested both in embedded and in matrix clauses. As indicated, the type of the clause always matches the rightmost element in the linear sequences, contrary to the Germanic pattern. That is, once the finite complementizer is inserted, the clause is typed as a declarative, but the presence of the interrogative elements leads to an echo interpretation. Consequently, there is a split between form and function that is not attested in Germanic. If there is no finite complementizer but a Q element is present, the clause is a polar interrogative, but the presence of the wh-element leads to an echo interpretation. Again, a clause is always typed by the C head but the Slavic pattern is crucial because the insertion of an operator into the specifier does not involve feature checking with the head: the C head lacks the features associated with the operator. Ordinary questions are possible only when a single interrogative element is present.
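The generalization just described (the leftmost element fixes the clause type in Germanic, the rightmost one in Slavic) can be restated as a small sketch. This is only an illustration of the descriptive pattern for overt doubling sequences, not part of the analysis itself; the function name `classify`, the labels `"WH"`, `"Q"`, `"FIN"` and the family strings are hypothetical choices made for this example.

```python
from typing import Sequence, Tuple

# Hypothetical labels for the three kinds of left-peripheral elements,
# listed in their fixed linear order.
ORDER = ["WH", "Q", "FIN"]

# Clause type associated with each element when it is the decisive one.
CLAUSE_TYPE = {
    "WH": "constituent question",
    "Q": "polar interrogative",
    "FIN": "declarative",
}


def classify(elements: Sequence[str], family: str) -> Tuple[str, bool]:
    """Return (clause type, echo reading?) for an overt sequence.

    elements: non-empty sub-sequence of WH, Q, FIN in linear order.
    family:   "germanic" or "slavic" (illustrative labels).
    """
    if not elements or any(x not in ORDER for x in elements):
        raise ValueError("unknown or empty sequence")
    if list(elements) != sorted(set(elements), key=ORDER.index):
        raise ValueError("elements must follow the order WH < Q < FIN")

    if family == "germanic":
        # The leftmost element matches the clause type; no echo reading.
        return CLAUSE_TYPE[elements[0]], False
    if family == "slavic":
        # The rightmost element types the clause; any additional
        # interrogative element to its left yields an echo reading.
        return CLAUSE_TYPE[elements[-1]], len(elements) > 1
    raise ValueError("family must be 'germanic' or 'slavic'")
```

For instance, `classify(["WH", "FIN"], "germanic")` gives a constituent question without an echo reading, whereas `classify(["WH", "FIN"], "slavic")` gives a declarative-typed clause with an echo reading. Echo questions without an overt complementizer are deliberately left out of this sketch.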

Regarding Germanic doubly filled COMP patterns, the following can be established. On the one hand, the movement of the wh-operator or the insertion of the polar operator into SpecCP takes place for clause-typing reasons and can thus be traced back to question semantics and to the requirement on feature checking with C. On the other hand, the insertion of the finite complementizer takes place in order to lexicalize [fin] in C.

By contrast, regarding Slavic doubly filled COMP patterns, the following can be established. On the one hand, the insertion of the operator (either a wh-operator or the polar operator) into SpecCP takes place due to an [edge] feature on the C head containing the elements introducing the echoed question, and there is no feature checking with C (given that there is no interrogative feature to be checked, as echo questions are not typed as interrogatives, see Bošković 2002: 363).<sup>24</sup> On the other hand, the insertion of the complementizers into C takes place because they type the echoed clause.

As far as echo questions are concerned, I assume that they are not true questions and are closer to focus constructions (cf. Bošković 2002, Artstein 2002). This is in line with the analysis of Bošković (2002), who claims that the fronting of echoed wh-phrases, as well as that of non-first wh-phrases in multiple fronting constructions, is independent of a strong [wh] feature on C. Accordingly, Bošković (2002: 359–364) analyses the relevant constructions as instances of focus fronting. Hence, the interrogative interpretation arises locally, similarly to English, where there is no wh-movement in echo questions, indicating that there is no [wh] feature on the C head (cf. Bošković 2002: 363).

We saw earlier that Slavic languages may allow embedded echo questions, even though these configurations are marked compared to their matrix counterparts. That is, the clause can be taken by a predicate taking interrogative complements (e.g. *ask*), which is normally possible if the clause is typed as [wh]. I assume that in echo clauses this is related to feature percolation: namely, the features of the element in the specifier can percolate up and hence the interrogative property, which is interpretable on the wh-element itself, is visible to the matrix predicate.<sup>25</sup> However, there is no percolation downwards, and hence the echoed clause itself is not affected.

Consider now the structures for WH FIN sequences in Germanic (here: English) and Slavic (here: Czech), respectively:

<sup>24</sup>Note that the WH Q sequence is special in this respect because the clause is typed as a polar interrogative by the Q-element, just as the declarative clause is typed as declarative by the relevant element in C. However, this configuration is also regular in the sense that the wh-element itself does not type the clause. Importantly, there is no incompatibility between an interrogative clause type and an echo reading, provided that the interrogative is typed independently of the echoing wh-phrase.

<sup>25</sup>The idea of feature percolation is well known in the syntactic literature and is subject to debates concerning its exact application and restrictions. As described by Heck (2008: 5–7), pied-piping has been treated in terms of feature percolation of the wh-feature since Chomsky (1973: 273), whereby the wh-feature projects to the DP-level and then percolates up to the PP level, that is, it is allowed to cross a phrase boundary. Essentially the same is proposed here in terms of the wh-feature percolating up to the CP, without causing changes in the C head itself (just like in the case of PPs, where feature percolation does not change the properties of the P). Naturally, this again raises the question how far a feature is allowed to percolate, the discussion of which clearly cannot be carried out in the present paper.


As can be seen, both configurations result in a doubly filled COMP pattern. However, the C is specified as [wh] only in (21a), which is a true interrogative, while the Slavic pattern in (21b) is an echo question. The complementizer is inserted in certain dialects in Germanic to lexicalize [fin], while Slavic complementizers are inserted to type the clause.<sup>26</sup>

Consider now the structures for Q FIN sequences in Germanic (here: English) and Slavic (here: Slovenian), respectively:

Again, the surface doubling configuration results in doubly filled COMP patterns in both cases. The C is specified as interrogative, this time as [Q], only in Germanic, see (22a), while in Slavic the question is merely an echo, see (22b). Further, the complementizer is inserted in certain dialects in Germanic to lexicalize [fin], while Slavic complementizers are inserted to type the clause.

Finally, consider the structures for WH Q FIN sequences, in Germanic (here: Dutch) and Slavic (here: Slovenian), respectively:

<sup>26</sup>Note that this does not mean that the complementizer is always overt. Declarative complementizers tend to have zero counterparts cross-linguistically and the same applies to e.g. *že* and *da*, too. This means that echo questions are possible without the insertion of an overt *že*, too. This option has not been discussed in detail here because the present paper is devoted to doubling patterns in the CP-domain.


As can be seen, the CP is split in both cases,<sup>27</sup> yet the C head is specified as [wh] only in the Germanic case, see (23a), while the Slovenian configuration represents an echo, see (23b). In (23a), the [wh] feature of the lower C head is not checked off, since the polar operator in SpecCP is merely [Q], a subset of [wh]; hence, the CP projects further. In (23b), there is no feature checking associated with either of the operators; they are inserted to render the echo reading. Again, the finite complementizer is inserted in certain dialects in Germanic to lexicalize [fin], while Slavic complementizers are inserted to type the clause.

The differences between Germanic and Slavic essentially go back to differences in the requirement of lexicalising [fin]: since this requirement is present in Germanic, the finite complementizer is inserted merely due to this requirement, while its appearance in Slavic doubly filled COMP constructions contributes to the echo reading by way of typing the clause merely as [fin] but not as [wh] or [Q].

### **6 Conclusion**

This paper investigated doubly filled COMP effects in Germanic and Slavic (to be more precise, Czech and Slovenian). It was shown that while the two language groups represent similar surface configurations, they differ crucially in the distribution and the interpretation of these structures. In Germanic, doubly filled COMP arises due to a requirement on filling a C head specified as [fin]; this is in line with the general properties of V2 (e.g. in German) and T-to-C (English).

<sup>27</sup>In the model adopted here, based on Bacskai-Atkari (2018b), the CP is split if certain features have to project further to be checked off but there is no predefined cartographic template in the sense of Rizzi (1997). However, the assumption that there can be multiple CPs (similarly to VPs) is widespread in the literature.

Importantly, the insertion of the finite complementizer takes place only in embedded questions and it brings interpretive differences from complementizer-less clauses. In Slavic, doubly filled COMP arises in echo questions and the complementizer is inserted to type the clause, while the element in the specifier does not check off its features with the head. The insertion of the complementizer involves an important interpretive difference from complementizer-less clauses, since the lack of the complementizer is associated with ordinary questions, while the presence of the complementizer triggers an echo interpretation. Taking all this into account, it can be concluded that the differences between Germanic and Slavic doubly filled COMP structures can be accounted for in a principled way.

### **Abbreviations**


### **Acknowledgements**

This research was funded by the German Research Fund (DFG), as part of my project "The syntax of functional left peripheries and its relation to information structure" (BA 5201/1-1). I would like to thank Jiri Kaspar, Mojmír Dočekal and Radek Šimík (Czech) and Moreno Mitrović (Slovenian) for their indispensable help with the data. I also owe many thanks to the audience of FDSL-12, in particular to Petra Mišmaš and Roland Meyer. Finally, I am highly grateful to the reviewers of my paper for their insightful and constructive questions and suggestions.

### **References**



ization. *Grazer Linguistische Studien* 83. 47–65. http://unipub.uni-graz.at/gls/periodical/titleinfo/1283369.


### **Chapter 2**

## **Russian datives again: On the (im)possibility of the small clause analysis**

Tatiana Bondarenko

Massachusetts Institute of Technology

In this paper I use the interpretation of the repetitive adverb *opjat'* 'again' in Russian to argue that ditransitive structures in this language do not involve a small clause structure (Kayne 1984; Beck & Johnson 2004; a.o.). Under the syntactic approach to the semantics of repetitives that I adopt (von Stechow 1996; Beck 2005; a.o.), the interpretation of repetitives is determined by their attachment in the syntactic representation. I show that in Russian ditransitives, unlike in English ones (Beck & Johnson 2004), only the repetitive reading of 'again' is possible, and argue that nothing other than a difference in the syntactic structures of ditransitives in the two languages can account for that. I also observe that, unlike datives that are found in ditransitives, "higher" dative arguments and locative applicatives in Russian can occur in constructions where there is a syntactic constituent denoting the resultant state, and thus the restitutive reading of repetitives is available.

**Keywords:** ditransitives, repetitives, datives, small clauses, Russian

### **1 Introduction**

In this paper I will discuss the applicability of the small clause analysis (Kayne 1984; Harley 1996; Beck & Johnson 2004; Pylkkänen 2008, among others), which has been proposed for the English double object construction (1), to constructions with dative arguments in Russian (2).<sup>1</sup>

<sup>1</sup>All examples in this paper are either in English or in Russian, unless explicitly indicated otherwise.

Tatiana Bondarenko. 2018. Russian datives again: On the (im)possibility of the small clause analysis. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 25–51. Berlin: Language Science Press. DOI:10.5281/zenodo.2545511



The small clause analysis involves the idea that in ditransitive constructions a direct object and an indirect object are merged together forming a small clause excluding the verb. This idea is shared by a variety of approaches (Kayne 1984; Pesetsky 1995; Harley 1996; 2002; Cuervo 2003; Beck & Johnson 2004; Jung & Miyagawa 2004; McIntyre 2006; Pylkkänen 2008; Schäfer 2008; Lomashvili 2010; Harley & Jung 2015, among others), which diverge on the exact nature of this formation (small clause/low applicative/PP/HaveP) and a few other details of the derivation. The tree in Figure 1 (adapted from Harley 2002) illustrates a version of this analysis for the English double object construction in (1): the direct object (*a letter*) and the indirect object (*Mary*) are combined with the help of a special P<sub>HAVE</sub>, and the resulting PP becomes a complement of the verb.

Figure 1: Double object construction (adapted from Harley 2002: 4)

The small clause analysis makes use of lexical decomposition in syntax: different subevents of a predicate are represented by different projections in syntax (*v*DO/CAUSP for a causing subevent, SC/ResultP/HaveP/PP for a result state subevent, among some others). Under such an approach to the syntax-semantics interface, indirect objects differ with respect to where they are introduced in the syntactically represented lexical decomposition of a given verb (Cuervo 2003; Schäfer 2008; among others). Their positions account for different interpretations and different syntactic properties. Indirect objects in the English double object construction are participants of the result state subevent under the small clause analysis.


The aim of this paper is to argue that Russian ditransitive verbs like *otdavat'* 'give' in (2) should not be analyzed as involving a small clause structure. While English might decompose ditransitive verbs in syntax (*give* as CAUSE to HAVE), Russian does not exhibit decomposition of this sort. My argumentation employs the idea that repetitive morphemes like *again* single out subevents in the semantics of a predicate, and thus are able to detect the exact placement of indirect objects in syntactic structures with lexically decomposed verbs. If an indirect object denotes a participant of some subevent *e*<sub>1</sub>, then it should be in the scope of a repetitive adverb that singles out that subevent *e*<sub>1</sub>. I will try to show that Russian has constructions where a dative argument is a participant of a stative subevent of a predicate, but ditransitive sentences are not among such constructions.

The crucial observation for my proposal is that the restitutive reading of again is available in English ditransitive sentences – in both the double object construction, see (3), and the *to-*PP construction, see (4) – but not in Russian, regardless of whether the dative argument precedes the accusative one, as in (5), or follows it, as in (6).<sup>2, 3</sup>

(3) Thilo gave Satoshi the map again. double object construction

a. **Repetitive**: Available 'Thilo gave Satoshi the map, and that had happened before.'

b. **Restitutive**: Available 'Thilo gave Satoshi the map, and Satoshi had had the map before.' (Beck & Johnson 2004: 113)

(4) Thilo gave the map to Satoshi again. to-PP construction

a. **Repetitive**: Available 'Thilo gave Satoshi the map, and that had happened before.'

b. **Restitutive**: Available 'Thilo gave Satoshi the map, and Satoshi had had the map before.' (Beck & Johnson 2004: 116)

<sup>2</sup> I do not want to imply that (5) and (6) are equivalents of the English double object construction and the *to-*PP construction, respectively. The sentences in (5)–(6) just show that the availability of the restitutive reading does not depend on the relative word order of dative and accusative arguments in Russian.

<sup>3</sup> I use again to refer to this kind of repetitive adverbs generally and words in italics (English *again,* Russian *opjat'*) to refer to concrete lexical items of languages.


Masha again gave book.acc Vasja.dat

a. **Repetitive**: Available 'Masha gave Vasja the book, and that had happened before.'

b. **Restitutive**: Unavailable 'Masha gave Vasja the book, and Vasja had had the book before.'

Under the restitutive reading, the subevent that is singled out by again is the state of possession between the indirect object and the direct object. For example, in (3) and (4) it is the reading when a state of Satoshi having the map is being repeated.<sup>4</sup> This reading is impossible for Russian ditransitives: in (5) and (6) again cannot single out the state of Vasja having the book. The example in (7) illustrates that providing more context does not increase the availability of the restitutive reading in Russian ditransitives.

<sup>4</sup>An anonymous reviewer asks whether the presence of the restitutive reading entails the small clause analysis for the PP datives, given the logic of Beck & Johnson (2004). While the analysis for the PP datives is not spelled out in detail in Beck & Johnson (2004), one can infer from the discussion therein that the authors propose distinct syntactic structures for the double object construction and the *to-*PP construction, both of which include a small clause. Given the logic of Beck & Johnson (2004), the double object construction includes a small clause that consists of the two objects merging with the help of a functional projection (XP), which is then combined with the verb. The *to-*PP construction under their view presents a subcase of a more general NP + PP pattern. In sentences of this sort V merges directly with a PP and takes an NP as its specifier. The PP under consideration contains a null PRO as its subject that corefers with the NP that is the specifier of the verb. Thus, as the authors themselves put it, the PP becomes in effect a small clause (Beck & Johnson 2004: 118). In other words, the presence of the restitutive reading in (4) under the logic of Beck & Johnson (2004) does entail the presence of a small clause in the syntactic structure but does not necessarily entail that the syntactic structures of the double object construction and the *to-*PP construction are identical.


(7) a. # I togda Maša opjat' {otdala / otpravila / vernula} Vase knigu.
and then Masha again {gave / sent / returned} Vasja.dat book.acc
Intended: 'And then Masha gave / sent / returned Vasja the book, and Vasja had had the book before.'

b. # I togda Maša opjat' {otdala / otpravila / vernula} knigu Vase.
and then Masha again {gave / sent / returned} book.acc Vasja.dat
Intended: 'And then Masha gave / sent / returned the book to Vasja, and Vasja had had the book before.'

Why does Russian differ from English with respect to the availability of the restitutive reading in ditransitives? Does this difference reflect different syntactic structures of ditransitive sentences in these languages? Does Russian have constructions with dative arguments where again is able to single out the stative subevent of a predicate? These questions will be central to the forthcoming discussion.

This paper is structured as follows. In §2 I will introduce the syntactic approach to the meaning of again and discuss how the availability of the restitutive reading in English ditransitives argues for the small clause analysis. In §3 I will argue against Russian ditransitives involving a small clause structure. I will consider different potential reasons for the unavailability of the restitutive reading in Russian ditransitive sentences and conclude that it has a syntactic explanation. In §4 I will discuss constructions with higher dative arguments and show that in these sentences the stative subevent can be singled out, but the dative argument is not a participant of it. In §5 I will provide evidence that dative arguments in Russian can in principle be participants of the stative subevent of a predicate and that a construction with locative applicatives exemplifies such a case. §6 concludes the paper.


### **2 The small clause analysis of ditransitives: Evidence from again**

In this paper I will assume the syntactic approach to the ambiguity of repetitive adverbs (von Stechow 1996; Beck & Johnson 2004; Beck 2005; Alexiadou et al. 2014; Lechner et al. 2015; among others), according to which different readings of again are attributed to different attachments of again in the syntactic representation. Under this approach the semantics of again is taken to be always the same and involve repetition of some event:<sup>5</sup>

(8) $[\![\text{again}]\!](e)(P)$

a. $= 1$ iff $P(e) \land \exists e'\,[e' <_T e \land P(e')]$

b. $= 0$ iff $\lnot P(e) \land \exists e'\,[e' <_T e \land P(e')]$

c. undefined otherwise

The semantics in (8) states that again takes an event *e* and a property of events *P* as its arguments and returns 1 if the property is true of the event and 0 if the property is not true of the event. The crucial part of again's meaning is a presupposition that there is another event that temporally precedes ($<_T$) the event under consideration of which the property is true. If the presupposition is not met, the meaning of again is undefined. Under the syntactic approach different readings of again arise due to its modification of different subevents in the syntactically represented lexical decomposition: the subevent that is modified by again is understood as being repeated.
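The three-way outcome in (8) can be made concrete with a minimal sketch over a finite toy model. Everything in it is an assumption made for illustration: events are plain string labels, temporal precedence is read off a hypothetical `TIMES` table, and the scenario (Satoshi had the map, lost it, and Thilo then gives it to him for the first time) is invented. `None` stands for the undefined case.

```python
from typing import Callable, Iterable, Optional

Event = str  # hypothetical: events are represented by plain labels


def again(
    e: Event,
    P: Callable[[Event], bool],
    events: Iterable[Event],
    precedes: Callable[[Event, Event], bool],
) -> Optional[bool]:
    """Three-valued meaning of 'again' following (8).

    Returns True (1) or False (0) when the presupposition holds,
    and None when it fails (the 'undefined' case).
    """
    # Presupposition: some event temporally preceding e has property P.
    if not any(precedes(earlier, e) and P(earlier) for earlier in events):
        return None
    # Assertion: P holds of e itself.
    return bool(P(e))


# Toy scenario (invented): Satoshi has the map (s0), loses it, Thilo
# gives it to him for the first time (g1), and Satoshi has it once
# more (s1).
TIMES = {"s0": 0, "g1": 2, "s1": 3}


def precedes(a: Event, b: Event) -> bool:
    return TIMES[a] < TIMES[b]


def is_having(e: Event) -> bool:
    # Property of the stative subevent (Satoshi having the map).
    return e in {"s0", "s1"}


def is_giving(e: Event) -> bool:
    # Property of the whole event (Thilo giving Satoshi the map).
    return e in {"g1"}


# 'again' applied to the state: an earlier having exists.
restitutive = again("s1", is_having, TIMES, precedes)
# 'again' applied to the whole event: there is no earlier giving.
repetitive = again("g1", is_giving, TIMES, precedes)
```

In this scenario `restitutive` is `True` while `repetitive` is `None`: applying again to the stative property is defined and true, whereas applying it to the whole-event property suffers a presupposition failure. The sketch only models the truth conditions; it says nothing about which attachment sites a given language makes available.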

Beck & Johnson (2004) claimed that the presence of the two readings of *again* with the double object construction provides support for the small clause analysis of English ditransitives. If ditransitive verbs such as *give* are lexically decomposed into the subevent denoting the action undertaken by an agent (represented in syntax by *v*) and the stative subevent (represented in syntax by a small clause – HaveP), then *again* should be able to attach to both *v*P and HaveP and modify the respective subevents, giving rise to the repetitive-restitutive ambiguity. This expectation is borne out, as we have observed in (3) (repeated here as (9)). The fact that indirect objects are understood as participants of stative subevents of ditransitive verbs suggests that they are inside a small clause that represents a given stative subevent syntactically. The analysis that Beck & Johnson (2004) propose for sentences like (9) is sketched out in (10) and (11) (for the repetitive and the restitutive reading, respectively).<sup>6</sup>

<sup>5</sup>There is a competing semantic approach to the ambiguity of repetitives (Fabricius-Hansen 2001; Jäger & Blutner 2000; among others), according to which different readings of again emerge due to the lexical ambiguity of repetitive morphemes. In this paper I will not discuss the applicability of the semantic approach to the data under consideration.

(9) Thilo gave Satoshi the map again.

a. **Repetitive**

'Thilo gave Satoshi the map, and that had happened before.'

b. **Restitutive**

'Thilo gave Satoshi the map, and Satoshi had had the map before.' (Beck & Johnson 2004: 113)

(10) **Repetitive reading**

a. **[<sub>*v*P</sub>** [<sub>*v*P</sub> Thilo [give [BECOME [<sub>HaveP</sub> Satoshi HAVE the map]]]] **again]**

b. $\lambda e\,.\,\text{again}(e)(\lambda e_1\,.\,\text{give}(e_1)(\text{Thilo}) \land \exists e_2[\text{BECOME}(e_2)(\lambda e_3\,.\,\text{HAVE}(e_3)(\text{the map})(\text{Satoshi})) \land \text{CAUSE}(e_2)(e_1)])$

c. 'Once more, a giving by Thilo caused Satoshi to come to have the map.' (Beck & Johnson 2004: 114)

(11) **Restitutive reading**


In (10) *again* attaches to the *v*P denoting the whole event of Thilo giving Satoshi the map, giving rise to the repetitive interpretation. In (11) *again* attaches to the small clause that denotes the stative event of Satoshi having the map, thus the restitutive reading arises.

For Beck & Johnson (2004) there are no elements CAUSE and BECOME in the syntactic representation of ditransitive sentences. Syntax provides a verb that takes a small clause as its complement, and it is the semantic component that is responsible for introducing components like CAUSE and BECOME that are required for deriving the correct interpretations. It was proposed by von Stechow (1995) (and further employed in Beck & Johnson 2004 and Beck 2005) that the following special semantic principle is at work in structures with small clauses:

<sup>6</sup>Smallcaps in semantic formulas indicate metalinguistic translations of object language. For instance, ⟦Satoshi⟧ = Satoshi. This means that again in semantic formulas equals ⟦again⟧ (the meaning of the word *again*) and not the cover term for English *again* and Russian *opjat'*, used elsewhere in the body of the paper.

(12) **Principle R** If $\alpha = [_{V}\ \gamma\ [_{SC}\ \beta]]$ and $\beta$ is of type $\langle s,t\rangle$ and $\gamma$ is of type $\langle e, \ldots \langle e, \langle s,t\rangle\rangle\rangle$ (an $n$-place predicate), then $[\![\alpha]\!] = \lambda x_1 \ldots \lambda x_n \lambda e\,.\,[\![\gamma]\!](e)(x_1)\ldots(x_n) \land \exists e_1[\text{BECOME}(e_1)([\![\beta]\!]) \land \text{CAUSE}(e_1)(e)]$. (adapted from Beck 2005: 7)

This principle ensures that a verb (an *n*-place predicate) is properly "glued" with a small clause (a property of events) by inserting CAUSE and BECOME components into the semantic representation.
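As a worked illustration, Principle R can be applied step by step to the verb and the HaveP in (10a). This is only a sketch: it assumes that *give* is a one-place predicate here (taking just the agent, so that *n* = 1) and that the HaveP denotes the property of events shown in the first line.

```latex
% Sketch: Principle R applied to [V give [HaveP Satoshi HAVE the map]].
% Assumption for illustration: gamma = give is one-place (agent only), so n = 1.
\begin{align*}
[\![\beta]\!] &= \lambda e_3.\ \text{HAVE}(e_3)(\text{the map})(\text{Satoshi}) \\
[\![\alpha]\!] &= \lambda x_1 \lambda e.\ \text{give}(e)(x_1) \wedge
  \exists e_1[\text{BECOME}(e_1)([\![\beta]\!]) \wedge \text{CAUSE}(e_1)(e)] \\
[\![\alpha]\!](\text{Thilo}) &= \lambda e.\ \text{give}(e)(\text{Thilo}) \wedge
  \exists e_1[\text{BECOME}(e_1)(\lambda e_3.\ \text{HAVE}(e_3)(\text{the map})(\text{Satoshi}))
  \wedge \text{CAUSE}(e_1)(e)]
\end{align*}
```

Up to the renaming of bound event variables, the last line is the property of events that again takes as its argument in (10b).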

This line of reasoning (Beck & Johnson 2004), which makes use of the syntactic decomposition of ditransitive verbs into a verb and a small clause and of the syntactic approach to the ambiguity of repetitive morphemes, offers a natural explanation of the possible interpretations of English *again* in the double object construction.<sup>7</sup> In the next section I will discuss why a similar logic is not applicable to the case of Russian ditransitives.

### **3 Russian ditransitives: Against the small clause analysis**

There are several potential reasons why restitutive readings are not available in Russian ditransitive clauses. The first hypothesis that I will explore is that the Russian repetitive adverb *opjat'* has different properties from English *again*. It has been observed that not all repetitive morphemes across languages have the ability to access different subevents inside decomposition structures (Rapp & von Stechow 1999; Beck 2005; Alexiadou et al. 2014; Lechner et al. 2015).

<sup>7</sup>There has been another attempt to explain the repetitive-restitutive ambiguity of *again* in the English double object construction by Bruening (2010), who argues for the asymmetrical applicative analysis of English ditransitives: a verb merges with a direct object first, and then the VP combines with an applicative head that introduces an indirect object as its specifier. Unlike under a small clause analysis, under this syntactic analysis the two interpretations of *again* do not fall out for free: special assumptions about verb head movement, object movement and interpretation of copies are required in order to obtain both repetitive and restitutive readings in ditransitive structures.


For example, the German repetitive adverb *erneut* 'again' cannot have restitutive readings with lexical accomplishment verbs like *öffnen* 'open', unlike another repetitive adverb *wieder* 'again'; see (13) and (14).<sup>8</sup>


b. **Restitutive**: Unavailable '…that Ali Baba opened Sezam, and Sezam had been open before.' (German; adapted from von Stechow 1996: 3)

This variation with respect to the ability of adverbs to single out different subevents in the syntactically represented lexical decomposition of predicates was captured by the Visibility Parameter (Rapp & von Stechow 1999; Beck 2005):

(15) **The Visibility Parameter for decomposition adverbs**
A D(ecomposition)-adverb can/cannot attach to a phrase with a phonetically empty head. (Rapp & von Stechow 1999 via Beck 2005: 13)

<sup>8</sup>Note that the unavailability of the restitutive reading in (13) cannot be due to its verb form (which is different from the one in (14)), since the use of the same form as in (14) does not lead to the availability of the restitutive reading:

(i) … dass Maria die Tür erneut öffnete. (German)
that Maria the door again opened
'…that Maria opened the door, and that had happened before.'


Under the assumption that lexical accomplishments in (13) and (14) involve a small clause with a null head that corresponds to the stative subevent of the door/Sezam being open, the Visibility Parameter states that the difference between German *wieder* and *erneut* is that the former, but not the latter, can attach to a phrase with a phonetically null head; hence only the former can have the restitutive reading in sentences with lexical accomplishments.

The following question can then be asked about Russian *opjat'*: Is it an adverb that can attach to a phrase with a phonetically empty head? It turns out that *opjat'* can single out the stative subevent of lexical accomplishments, see (16) and (17), thus classifying as a decomposition adverb that can "look inside" the decomposition structure and modify subevents that are not expressed by overt phonetic material. *Opjat'* is not different from German *wieder* or English *again* in this respect.

	- a. **Repetitive**: Available 'Vasja opened the door, and that had happened before.'
	- b. **Restitutive**: Available 'Vasja opened the door, and the door had been open before.'
	- a. **Repetitive**: Available 'Vasja emptied the bottle, and that had happened before.'
	- b. **Restitutive**: Available 'Vasja emptied the bottle, and the bottle had been empty before.'
	- a. **Repetitive**: Available 'Ali Baba opened Sezam, and that had happened before.'
	- b. **Restitutive**: Available 'Ali Baba opened Sezam, and Sezam had been open before.'

Note that, unlike *wieder* and *again*, Russian *opjat'* occurs preverbally, see (5)–(7), (16), and (17); this does not prevent it from having restitutive readings, as (16) and (17) show.<sup>9</sup> The fact that *opjat'* generally allows for restitutive readings when it precedes the verb suggests that the word order in (5)–(7) cannot be the reason for the unavailability of restitutive readings in ditransitive clauses. To sum up, it seems highly unlikely that the properties of *opjat'* prevent restitutive readings in Russian ditransitives.

A second hypothesis that I will consider is that restitutive readings are unavailable in Russian ditransitives due to the absence of a stative subevent in the semantics of ditransitive verbs. I will argue that this hypothesis is also wrong: ditransitives have a stative subevent in their semantics, which can independently be detected by another Russian adverb, namely *obratno* 'back'/'again', and can be introduced into syntax with the help of an eventive goal PP. Crucially, I will argue that the stative subevent is not represented in the syntactic decomposition of ditransitive verbs that take just an accusative argument and a dative one.

The Russian adverb *obratno* 'back'/'again' (glossed below simply as obratno), although similar in its meaning to *opjat'*, has different semantics, which involves a return to a state in which an entity had been before (as already observed by Tatevosov 2016). As a consequence, it can modify only descriptions with a target state in the sense of Kratzer (2000) and allows for restitutive readings only, see (19).

<sup>9</sup>The situation is different for English and German, where the pre-object position of repetitive adverbs makes the restitutive reading unavailable, see (i) and (ii).

	- a. **Repetitive**: Available 'Ali Baba opened Sezam, and that had happened before.'
	- b. **Restitutive**: Unavailable 'Ali Baba opened Sezam, and Sezam had been open before.'
	- b. **Restitutive**: Unavailable '… that Ali Baba opened Sezam, and Sezam had been open before.'


	- a. # Rovno odin student otkryl okno obratno.
	  exactly one student opened window.acc obratno
	  'Exactly one student opened the window again.'
		- i. **Repetitive reading**: Unavailable

'There exists a student that opened the window and had opened it before, and it is not true that other students opened the window and had opened it before.'

(exactly one *x* > again > *x* opened the window > the window was open)

ii. **Restitutive reading**: False

'There exists a student that opened the window and no other student opened the window and the window had been open before.'

(exactly one *x* > *x* opened the window > again > the window was open)

b. Rovno odin student opjat' otkryl okno.
exactly one student again opened window.acc
'Exactly one student opened the window again.'

i. **Repetitive reading**: True

'There exists a student that opened the window and had opened it before, and it is not true that other students opened the window and had opened it before.'

(exactly one *x* > again > *x* opened the window > the window was open)

ii. **Restitutive reading**: False

'There exists a student that opened the window and no other student opened the window and the window had been open before.'

(exactly one *x* > *x* opened the window > again > the window was open) (adapted from Tatevosov 2016: 31)

Alexiadou et al. (2014) and Lechner et al. (2015) observed that the repetitive and the restitutive readings exhibit different truth conditions in contexts with non-monotone quantifiers like 'exactly one student' or 'only one student'. For the context in (19), sentences with subjects that are non-monotone quantifiers are true only under the repetitive reading of again, see (19b-i) vs. (19b-ii). While *opjat'* can have repetitive readings and thus (19b) is appropriate in the context provided, *obratno* is illicit in this context because it cannot have repetitive readings.

*Obratno* "looks into" the semantics of a verbal phrase with which it merges and searches for a target state in this semantic representation that it can modify. As the sentence in (20) shows, *obratno* is able to find a target state in the semantic representation of Russian ditransitives.

(20) Maša {otdala / otpravila / vernula} Vase knigu obratno.
Masha {gave / sent / returned} Vasja.dat book.acc obratno
'Masha gave / sent / returned Vasja the book, and Vasja had had the book before.'

A detailed analysis of the properties of Russian *obratno* is beyond the scope of this paper. What is important for us here is that *obratno* can serve as a diagnostic for a stative subevent: it shows us that a result state is present in the semantics of ditransitive predicates.<sup>10</sup>

Another piece of evidence that Russian ditransitive verbs have a stative subevent in their semantics comes from the comparison of ditransitive constructions with a dative and an accusative argument with constructions with the same verbs that take an accusative argument and a goal PP. Consider the following two sentences with the verb *otpravljat'* 'send':


<sup>10</sup>There could be different plausible explanations for the unavailability of repetitive readings with *obratno*. For example, it could be the case that *obratno* is actually not a VP-level adverb but a PP modifier which in some cases signals the presence of a silent PP. Some support for this hypothesis is provided by examples like (i) and (ii), where *obratno* seems to form a constituent with an overtly realized PP (the examples involve movement of *obratno* + PP – scrambling and wh-movement, respectively):

If *obratno* is a PP modifier, then it follows that it can have exclusively restitutive readings. Under this hypothesis, *obratno* signals the presence of a silent goal PP in (20), which introduces the stative subevent into the syntactic representation that was otherwise not present. I will not pursue this idea here, leaving it for future research.


	- a. Available: 'Masha sent Vasja the toy, and that had happened before.'
	- b. Unavailable: 'Masha sent Vasja the toy, and Vasja had had the toy before.'
	- a. Available: 'The manager sent the employee to Moscow, and that had happened before.'
	- b. Available: 'The manager sent the employee to Moscow, and the employee had been in Moscow before.'

When this verb takes an accusative argument and a dative one (21), the restitutive reading of *opjat'* is unavailable. When, however, it takes an accusative argument and a goal PP (22), *opjat'* is able to single out the subevent that denotes the state of the theme argument (the employee) being at the location specified by the goal PP (Moscow).

This difference can also be observed with PPs headed by *k* 'to', which can take animate noun phrases as their complements. Sentences with ditransitive verbs that take a direct object and a *k*-PP, see (24), seem almost synonymous with those with ditransitive verbs that take two objects, see (23); but the restitutive reading is available only in the former construction.

	- b. **Restitutive**: Available 'Masha sent the book to Katja, and Katja had had the book before.'


If we assume that ditransitive verbs like *otpravljat'* 'send' have uniform semantics across their uses, then it follows that they should have a stative subevent in their semantic representation, since it is visible in some clauses with these verbs.

Why does the presence of a goal PP make the restitutive reading available in sentences with ditransitive verbs? I would like to suggest that the reason is that PPs, unlike dative arguments, can be eventive (see McIntyre 2006) and introduce subevents that are present in the semantics of a predicate into the syntactic representation. This difference between dative arguments and goal PPs, as well as the fact that they can co-exist in the same clause, see (25) (cf. English (26)), suggests that PP ditransitives and ditransitives with dative arguments cannot be derivationally related.

	- b. Ja brosil {Vasje mjač / mjač Vasje} v ruki.
	  I threw {Vasja.dat ball.acc / ball.acc Vasja.dat} in hands
	  'I threw a ball to Vasja, into his hands.'

To sum up, sentences with Russian ditransitive verbs can have restitutive readings in two cases. First, the adverb *obratno* can access a target state in the semantic representation of a verbal phrase. Second, a goal PP can introduce a target state into the syntactic representation, making the restitutive reading available even with the repetitive adverb *opjat'*, which requires a syntactic constituent corresponding to the result state. This suggests that the unavailability of restitutive readings with dative arguments cannot be explained by the absence of a stative subevent in the semantics of Russian ditransitives.

If Russian *opjat'* has the same properties as English *again* and Russian ditransitives have a stative subevent in their event structure, then we have to conclude that for some reason this stative subevent is not represented in syntax. In other words, no small clause (or HaveP/PP/LowApplP) is present in Russian ditransitive sentences with dative arguments. Why is it the case that such a small clause cannot be built? I will first explore a semantic hypothesis: the relevant structure can be built, but cannot be interpreted due to the absence of the interpretation Principle R in Russian.

It has been argued (Snyder 2001; Beck & Snyder 2001; Beck 2005) that the interpretation Principle R is not universal: languages differ with respect to whether they have a principle that allows the combination of a verb and a small clause to be interpreted, and this variation is responsible for the (un)availability of a number of constructions, including resultatives, verb-particle constructions, *put*-locative constructions, *make*-causative constructions and the double object construction, among others. Could it be the case that Russian is one of the languages that lack Principle R?

This hypothesis is dubious, since Russian seems to require some version of this principle independently for interpreting other constructions.<sup>11</sup> One example of a case where such a principle would be needed is sentences with verbs that take lexical prefixes.

(27) Vasja za-brosil mjač v vorota.
Vasja pvb-throw ball in goal
'Vasja threw the ball into the goal.'

Svenonius (2004) has proposed that lexical prefixes in Russian, such as *za* in (27), enter the derivation as heads of small clauses that are complements of verbs. Under this view, lexical prefixes head their own projections and take PPs as their complements and direct objects as their subjects (Figure 2).

Figure 2: Lexical prefixes as heads of small clauses

This analysis receives additional support from the fact that *opjat'* can have the restitutive reading in sentences with verbs with lexical prefixes. Consider (28):

<sup>11</sup>As an anonymous reviewer points out, Russian does have resultative constructions. For example, one type of Russian resultatives is discussed in Tatevosov (2010). I am grateful to the anonymous reviewer for this observation, which provides an additional argument against the inaccessibility of Principle R in Russian.


(28) *Context:* This ball was lying inside the goal for as long as we can remember. For the first time someone threw the ball out of the goal. But five minutes later…

Vasja opjat' za-brosil mjač v vorota.
Vasja again pvb-throw ball in goal
'Vasja threw the ball into the goal, and the ball had been in the goal before.'

*Opjat'* in (28) has the interpretation under which an event that has occurred before is the event of the ball being inside the goal. Under the syntactic approach to the ambiguity of again, this suggests that there is a syntactic constituent – a small clause, which represents the stative subevent of the predicate and to which *opjat'* can attach (Figure 3).

Figure 3: The small clause analysis of Russian *zabrosit'* 'throw'

If Russian did not have a means of interpreting the combination of a verb and a small clause (the Principle R or its equivalent), then the sentence in (28) should be uninterpretable and thus lead to a derivation crash. This implies that uninterpretability cannot be the problem that prevents building a small clause structure for sentences with ditransitive verbs in Russian.
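The two attachment sites for *opjat'* in (28) can be sketched by applying Principle R to the structure in Figure 3. The stative predicate IN, the treatment of the prefix and the PP as jointly denoting a single property of events, and the one-place entry for 'throw' are simplifying assumptions made for this illustration only; they are not part of the analysis argued for in the text.

```latex
% Sketch only. Assumptions (not from the text): the stative predicate IN,
% a one-place entry for 'throw', and the small clause denotation [[beta]].
\begin{align*}
[\![\beta]\!] &= \lambda e_3.\ \text{IN}(e_3)(\text{the goal})(\text{the ball}) \\
\text{Repetitive:}\quad & \lambda e.\ \text{again}(e)(\lambda e_1.\ \text{throw}(e_1)(\text{Vasja}) \wedge
  \exists e_2[\text{BECOME}(e_2)([\![\beta]\!]) \wedge \text{CAUSE}(e_2)(e_1)]) \\
\text{Restitutive:}\quad & \lambda e.\ \text{throw}(e)(\text{Vasja}) \wedge
  \exists e_1[\text{BECOME}(e_1)(\lambda e_2.\ \text{again}(e_2)([\![\beta]\!])) \wedge \text{CAUSE}(e_1)(e)]
\end{align*}
```

In the restitutive formula again applies only to the property of the ball being in the goal. This is the reading that the context in (28) supports: the ball had been in the goal before, and no earlier throwing event is mentioned.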

This brings us to the conclusion that ditransitive sentences with dative arguments in Russian do not contain a small clause for syntactic reasons: the structure with SC/HaveP/LowApplP/particular kinds of null P/R cannot be built. As a consequence, under our assumption that the availability of the restitutive reading entails lexical decomposition in syntax,<sup>12</sup> the syntax of ditransitive clauses in Russian significantly differs from the syntax of similar sentences in English. While English might decompose *give* syntactically as CAUSE to HAVE, this sort of decomposition does not take place in Russian. A more general consequence follows from this difference between the two languages: the lexical decomposition for a given predicate cannot be universal; languages differ with respect to how they map event structures of similar predicates onto syntactic representations.

### **4 Restitutive readings with Russian datives: Higher datives**

Dative arguments can differ with respect to how they are related to a result state of a given predicate. In this section I will show that restitutive readings of *opjat'* are available in sentences with higher, non-subcategorized dative arguments, but that in these clauses dative noun phrases do not denote participants of stative subevents singled out by *opjat'*.

Clauses with non-subcategorized dative arguments and predicates like *otkryt' dver'* 'open the door' do not exhibit the restitutive reading when the dative argument follows the verb (29). When the dative argument is scrambled to the left of again, however, it escapes the adverb's scope, and the restitutive reading becomes available (30):

	- a. **Repetitive**: Available 'Vasja opened the door for Masha, and that had happened before.'
	- b. **Restitutive**: Unavailable 'Vasja opened the door for Masha, and the door had been open before.'
	- a. **Repetitive**: Available 'Vasja opened the door for Masha, and that had happened before.'
	- b. **Restitutive**: Available 'Vasja opened the door for Masha, and the door had been open before.'

<sup>12</sup>An anonymous reviewer reasonably points out that this assumption is not shared by everyone working on double object constructions. The conclusions that I argue for in this paper follow only if this assumption is retained.


As can be seen from the restitutive reading of (30), the dative argument is not interpreted as a participant of the stative subevent of the predicate *otkryt' dver'* 'open the door'. The interpretation in (30b) states that Vasja did some activity for Masha that resulted in the repeated state of the door being open. This suggests that non-subcategorized datives are introduced higher than the syntactically represented stative subevents.

Note that scrambling of dative arguments to the left of *opjat'* in ditransitive sentences does not feed the restitutive reading:

(31) *Context:* Vasja had always had the book *Two captains* by Kaverin; he had never given it to anyone. One day he accidentally left the book at Masha's place…

\# I togda Maša Vase opjat' {otdala / otpravila / vernula} knigu.
and then Masha Vasja.dat again {gave / sent / returned} book.acc
Intended: 'And then Masha gave / sent / returned Vasja the book, and Vasja had had the book before.'

This means that stative subevents are not represented in the syntax of ditransitives with dative arguments. If they were present in the syntactic representation, they could be singled out at least in cases when datives are scrambled.

The fact that the restitutive reading of *opjat'* is available in sentences with non-subcategorized datives, in contrast to ditransitive sentences with datives, is concordant with the proposal that non-subcategorized dative arguments are introduced higher than VPs (Boneh & Nash 2017). One piece of evidence for this comes from the fact that sentences with non-subcategorized datives show asymmetrical binding: only the dative argument can bind the accusative one, but not the other way around:

	- b. Šaman zakoldoval oxotnikam drug druga.
	  shaman jinxed hunters.dat each other.acc
	- c. \* Šaman zakoldoval drug drugu oxotnikov.
	  shaman jinxed each other.dat hunters.acc
	- d. ⁇ Šaman zakoldoval drug druga oxotnikam.
	  shaman jinxed each other.acc hunters.dat

(Intended:) 'The shaman jinxed the hunters for each other.'

(Boneh & Nash 2017)


It can be shown that evidence from binding and from the scope of *opjat'* go hand in hand: sentences with non-subcategorized datives, in which the dative argument asymmetrically binds the direct object, exhibit restitutive readings when the dative argument is scrambled outside the scope of *opjat'*:

(33) *Context:* Two hunters were born jinxed and had been this way for a long time. One day a good witch relieved them of the jinx. But after some time, they had a huge fight and were very angry with each other. Each of them came to the shaman to ask him to jinx the other one.

Šaman oxotnikam opjat' zakoldoval drug druga.
shaman hunters.dat again jinxed each other.acc
'The shaman jinxed the hunters for each other, and the hunters had been jinxed before (but the shaman had never jinxed them before).'

Thus, non-subcategorized datives are introduced higher than VPs and cannot be understood as participants of stative subevents of predicates. But if a predicate has a stative subevent, it can be successfully singled out by *opjat'* in case the dative argument is scrambled to the left of the repetitive adverb.

### **5 Restitutive readings with Russian datives: Locative applicatives**

In the previous section I have discussed a case of the restitutive reading in structures with a dative argument which was not a participant in the stative subevent singled out by *opjat'*. In this section I will show that Russian also has a construction in which a dative argument is a participant of the stative subevent detected by the restitutive *opjat'*.

The construction under consideration, which I will call the locative applicative construction ("N-applicatives" in the terminology of Pshekhotskaya 2012), usually involves a motion verb that takes a direct object, a goal PP and an optional dative argument:

(34) Maša opjat' položila knigu Vase na stol.
Masha again put book.acc Vasja.dat on table

	- a. **Repetitive**: Available 'Masha put the book on the table for Vasja, and that had happened before.'
	- b. **Restitutive**: Available 'Masha put the book on the table for Vasja, and Vasja had had the book on the table before.'

In (34) the dative argument is interpreted as a possessor of the small clause that represents the stative subevent "the book is on the table": Vasja's having the book on the table is being repeated.

The locative applicative construction is not found exclusively with motion verbs; it is also sometimes possible with lexical causatives (35) and change-of-state predicates (36).

	- a. **Repetitive**: Available 'Vasja seated the daughter on the chair for Masha, and that had happened before.'
	- b. **Restitutive**: Available 'Vasja seated the daughter on the chair for Masha, and Masha had had the daughter sit on the chair before.'
	- a. **Repetitive**: Available 'Masha whitened the wall in the room for the mother, and that had happened before.'
	- b. **Restitutive**: Available

'Masha whitened the wall in the room for the mother, and the mother had had the wall white in the room before.'

The dative argument in this structure is merged lower than the direct object, as the evidence from binding suggests: the dative reciprocal can be bound by the direct object, but the accusative reciprocal cannot be bound by the dative argument:

(37) a. Vasja posadil devoček drug drugu na stulja.
Vasja seated girls.acc each other.dat on chairs
'Vasja seated the girls – A and B – in such a way that A has B sitting on A's chair and B has A sitting on B's chair.' (Literally: Vasja seated the girls<sub>*i*</sub> to each other<sub>*i*</sub> on the chairs.)

b. \* Vasja posadil drug druga devočkam na stulja.
Vasja seated each other.acc girls.dat on chairs
Intended: 'Vasja seated the girls – A and B – in such a way that A has B sitting on A's chair and B has A sitting on B's chair.' (Literally: Vasja seated each other<sub>*i*</sub> to the girls<sub>*i*</sub> on the chairs.)

The example in (38) shows that the dative reciprocal that is bound by the direct object can be a participant of the stative subevent identified by *opjat'*:

	- a. **Repetitive**: Available

'Vasja seated the girls – A and B – in such a way that A has B sitting on A's chair and B has A sitting on B's chair, and that had happened before.'

(Literally: Vasja seated girls<sub>*i*</sub> to each other<sub>*i*</sub> on the chairs, and that had happened before.)

b. **Restitutive**: Available

'Vasja seated the girls – A and B – in such a way that A has B sitting on A's chair and B has A sitting on B's chair, and there was a situation before where A had B sitting on A's chair, and B had A sitting on B's chair.' (Literally: Vasja seated girls<sub>*i*</sub> to each other<sub>*i*</sub> on the chairs, and the girls<sub>*i*</sub> had sat by each other<sub>*i*</sub> on the chairs before.)

It can also be demonstrated that the dative argument forms a constituent with the locative phrase. When a dative argument is a wh-word, it can pied-pipe the prepositional phrase to the left periphery:

	- b. [Komu na stul] Vasja posadil devočku?
	  who.dat on chair Vasja seated girl.acc
	  'Which person *x* is such that Vasja seated a girl for *x* on *x*'s chair?'
	- c. [Komu v školu] Maša otdala syna?
	  who.dat in school Masha gave son.acc
	  'Which person *x* is such that Masha gave her son to *x*, to *x*'s school?'


I would like to propose that in the locative applicative construction the dative noun phrase is an applicative argument that is introduced on top of the PP that introduces a stative subevent into the syntactic representation. Since applicative heads introduce an abstract HAVE relation between the applied argument and the complement of Appl (Cuervo 2003; McIntyre 2006; among others), the fact that the dative argument in Russian locative applicatives is interpreted as a holder of the state that the PP denotes is expected if the dative argument is applied to an eventive PP; see (40) and Figure 4.<sup>13</sup>

	- a. **Repetitive**: Available 'Vasja hung the picture for Katja on the wall, and that had happened before.'
	- b. **Restitutive**: Available 'Vasja hung the picture for Katja on the wall, and Katja had the picture on the wall before.'

The restitutive reading of *opjat'* in this construction arises when *opjat'* attaches to an applicative phrase (Figure 4) and takes scope over the stative subevent denoted by a goal PP. The dative argument falls inside the scope of *opjat'* since it is an applied argument of an eventive PP and not an argument of the verb.

Figure 4: The locative applicative construction (40)

<sup>13</sup>The structure in Figure 4 feeds the relevant (restitutive) interpretation. In order to derive the attested word order, cf. (40), I assume that later in the derivation the lexical verb *povesil* 'hung' undergoes further movement to Asp (see Harizanov & Gribanova 2018 for discussion), and the repetitive adverb *opjat'* moves to a position before the verb (the arguments for a movement analysis of repetitives that were proposed in Xu 2016 for Chinese hold for Russian as well), with subsequent reconstruction into its base position at LF.

### **6 Conclusions**

In this paper I have argued against the small clause analysis of Russian ditransitives. I have observed that although the Russian repetitive adverb *opjat'* has the same ability to look inside the decomposition structure as English *again*, it cannot have the restitutive reading in clauses with ditransitive verbs that take two objects, in contrast to *again* in the English double object construction. I have shown that Russian ditransitives have stative subevents in their semantics and that the unavailability of a small clause structure for Russian ditransitives cannot be explained by a semantic restriction, since the Principle R or its equivalent, which allows a combination of a verb and a small clause to be interpreted, is independently required for other constructions of Russian. I have concluded that the small clause structure is not present in Russian ditransitives due to syntactic reasons: the syntax cannot build such a structure. The unavailability of the restitutive reading in Russian ditransitives suggests that they are not equivalent to the English double object construction or the *to-*PP construction. They also cannot be analyzed as involving a silent (incorporated) P, since the structure with a PP would make the restitutive reading available. Although the new empirical data discussed in this paper is compatible with several analyses of ditransitives (for example, with the applicative analysis of Bruening 2010 or a non-derivational analysis along the lines of Boneh & Nash 2017) and does not settle on a particular one, it clearly shows that Russian ditransitives do not involve a small clause structure and differ from English ditransitives significantly.

I have also examined two other constructions with dative arguments in Russian, both of which allow for the restitutive reading of *opjat'*. In sentences with "high" datives the restitutive reading is available if the dative argument escapes the scope of *opjat'*. The dative does not denote a participant of the stative subevent in this case, which means that it cannot be introduced into the structure lower than the first subevent of the predicate. In the locative applicative construction, the dative argument is a participant of the subevent introduced by a PP and is inside the scope of the restitutive *opjat'*. I have argued that in this construction the dative is an applied argument to the PP, and therefore is always lower than the direct object, forms a constituent with the PP and can be inside the scope of *opjat'* under the restitutive reading.

### **Abbreviations**


### **Acknowledgments**

Many thanks to Sergei Tatevosov for his feedback and to the audience of the FDSL12 conference.

### **References**




### **Chapter 3**

## **Imperfective past passive participles in Russian**

### Olga Borik

The National Distance Education University (UNED)

### Berit Gehrke

Humboldt-Universität zu Berlin

Contra the received view that Russian past passive participles (PPPs) can only be derived from perfective verb forms, we show that imperfective (IPF) PPPs can be found in corpora as well. A substantial subset of these should receive a compositional analysis, given that they can be used in periphrastic passive constructions with predictable meaning contribution. However, these IPF PPPs commonly require a modifier and occur with a particular information structure, often accompanied by a marked word order, where the event described by the PPP is backgrounded (occurs first) and focus is on the modifier (appearing somewhere after the PPP). We propose an analysis, under which such uses of the IPF are parallel to definite descriptions, in the sense that the IPF signals an anaphoric link to a previously introduced or inferable eventive discourse referent, and the modifier provides new information about this event.

**Keywords:** presuppositional imperfective, passive, past passive participle, Russian

Olga Borik & Berit Gehrke. 2018. Imperfective past passive participles in Russian. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 53–76. Berlin: Language Science Press. DOI:10.5281/zenodo.2545513

### **1 Introduction**

In Russian, as in other Slavic languages, there are two types of passives. The reflexive passive is formed by the reflexive marker/postfix *-sja*, whereas the periphrastic passive combines a past passive participle (PPP) with a form of *byt'* 'be'. It is generally assumed for Russian (but not necessarily for other Slavic languages; see §4) that the two types of passives are aspectually restricted (e.g., Babby & Brecht 1975), in the sense that imperfectives only appear in reflexive (1), perfectives only in periphrastic passives (2).

	- b. Vorota otkryvalis' storožem.
	  gates.nom opened.ipf.rfl watchman.instr
	  'The gate was (being) opened by a/the watchman.'
	- c. \* Vorota byli otkryvany storožem.
	  gates.nom were opened.ipf.ppp watchman.instr

	- b. Vorota byli otkryty storožem.
	  gates.nom were opened.pf.ppp watchman.instr
	  'The gate was opened by a/the watchman.'
	- c. \* Vorota otkrylis' storožem.
	  gates.nom opened.pf.rfl watchman.instr

In this paper, we show that this is an oversimplified view. In particular, we address the occurrence of imperfective PPPs in Russian periphrastic passives, such as (3), which, according to the generalization exemplified above, should either not exist at all or be at most exceptional.<sup>1</sup>

(3) Oni byli šity kornjami berezy ili vereska i byli očen' krepki.
    they were sewn.ipf roots.instr birch.gen or heather.gen and were very tough
    'They were sewn with birch or heather roots and were very tough.'

From a purely morphological perspective, and also from a cross-Slavic perspective, nothing is wrong with imperfective PPPs per se. While (4) shows that PPPs are regularly derived from perfective verbs, we can see in (5) that imperfective ones exist as well.<sup>2</sup>

<sup>1</sup>There are also possibly exceptional examples for reflexive passives of perfective verbs; see, e.g., Schoorlemmer (1995) and Fehrmann et al. (2010) for relevant examples.

<sup>2</sup> In this paper we set aside long form PPPs and focus on short form PPPs only, such as those in (4) and (5), since these are the ones used in passives (see Borik 2014 for further discussion).

	- b. *rasserdit'* 'make.angry.pf' > *rasseržen* 'made.angry.pf'
	- c. *zakryt'* 'close.pf' > *zakryt* 'closed.pf'
	- b. *slyšat'* 'hear.ipf' > *slyšan* 'heard.ipf'
	- c. *krasit'* 'paint.ipf' > *krašen* 'painted.ipf'

Nevertheless, the received view is that imperfective PPPs like those in (3) and in (5) are rare, idiomatic or frozen forms that function like adjectives (e.g. Švedova 1980; Schoorlemmer 1995). A common strategy in the discussion of periphrastic passives in Russian is therefore to completely ignore such participles (Babby & Brecht 1975; Paslawska & von Stechow 2003). A non-standard and somewhat more refined view, and one that we share, is found in Knjazev (2007), who notes that imperfective PPPs are somehow restricted in use, in comparison to more "regular" perfective ones. However, he does not give a formal account of their semantics, nor a detailed description of when and why such participles appear.

Our goal in this paper is to show, based on naturally occurring data in a corpus, that imperfective past passive participles are indeed participles, not only by name and by their morphology, but also by their distribution. We show that they can be participles, not adjectives, based on their predictable compositional semantics, as well as their occurrence in regular periphrastic passive constructions, both verbal and adjectival. We argue that a subgroup of such participles constitutes a case of the presuppositional imperfective (in the sense of Grønn 2003), a subtype of the so-called general-factual imperfective, which expresses the sheer fact that an/the event took place.

Among the readings generally associated with the imperfective aspect in Russian, the general-factual reading, which we will have more to say about in §2.3, is the most well-studied one. It is usually characterized as a non-canonical reading, in which the imperfective aspect is in "aspectual competition" with the perfective aspect (a term that goes back to at least Mathesius 1938). Canonical imperfective meanings that in Russian are expressed almost exclusively by imperfective forms are process and habitual readings.

As a side note we want to emphasize that we reserve the terms (im)perfective for morphological forms of a given verb, regardless of the semantics associated with such forms in a given context. In particular, we study imperfective forms used in contexts that might semantically be called perfective, namely completed bounded events in the past.

The paper is structured as follows. §2 outlines the empirical generalization from our corpus study and establishes that imperfective PPPs appear in regular periphrastic passives. We also show that the imperfective contexts that such participles are found in express non-canonical imperfective meanings, and we hypothesize that they always involve either the existential or the presuppositional subtype of the general-factual imperfective. §3 provides an analysis of presuppositional imperfective PPPs and offers further arguments in favour of such an analysis. Finally, §4 concludes and gives an outlook on further research questions and open issues.

### **2 The data**

We extracted data from the Russian National Corpus (RNC)<sup>3</sup> of 109,028 documents, which contained 22,209,999 sentences and 265,401,717 words. Using the grammatical features partcp,praet,pass,ipf, we focused on imperfective past passive participles directly preceding or following a finite form of *byt'* 'be' (BE). We found 2,632 contexts in which the participle precedes BE and 17,015 in which it follows BE, which reflects the unmarked status of the word order with BE preceding the participle. Our search thus excludes participles with a non-finite or null form of BE (i.e. present tense), participles as second conjuncts in coordination with, e.g., other participles, etc. Since we used the non-disambiguated corpus version, we manually excluded biaspectual forms, which are marked as imperfective in the RNC, such as *obeščan* 'promised', *velen* 'ordered', and verbs in *-ovat'* (e.g. *ispol'zovan* 'used', *realizovan* 'realized'). We furthermore excluded all long form participles, given that only short form participles canonically appear in Russian periphrastic passive constructions. Finally, we excluded errors in tagging, such as *Sezan* (the French painter Cézanne), *strašen* 'terrible/scary.adj' (tagged as a participle), or perfective participles erroneously tagged as imperfective (e.g. *otvečen* 'answered.pf'). Given these limitations, we will not provide a quantitative analysis.
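The selection procedure just described can be sketched in code. This is only an illustration under stated assumptions, not the tooling actually used for the study: the `Token` representation, the tag names `fin` and `plen`, and the small exclusion list are hypothetical stand-ins, and only the feature bundle partcp,praet,pass,ipf is taken from the text.

```python
from dataclasses import dataclass

# Feature bundle quoted in the text for imperfective past passive participles.
IPF_PPP_TAGS = frozenset({"partcp", "praet", "pass", "ipf"})

# Hypothetical tag names (assumptions, not verified RNC labels).
FINITE_TAG = "fin"      # assumed marker of a finite verb form
LONG_FORM_TAG = "plen"  # assumed marker of a long-form participle

# Stand-in for the manual check: some biaspectual forms and tagging errors
# mentioned in the text (masculine singular forms only, for illustration).
MANUAL_EXCLUSIONS = frozenset(
    {"obeščan", "velen", "ispol'zovan", "realizovan", "strašen", "otvečen"}
)


@dataclass(frozen=True)
class Token:
    form: str
    lemma: str
    tags: frozenset


def is_finite_be(tok):
    """True for a finite form of byt' 'be' (BE)."""
    return tok.lemma == "byt'" and FINITE_TAG in tok.tags


def is_candidate(tok):
    """True for a short-form imperfective PPP that survives the exclusions."""
    return (
        IPF_PPP_TAGS <= tok.tags
        and LONG_FORM_TAG not in tok.tags
        and tok.form.lower() not in MANUAL_EXCLUSIONS
    )


def find_contexts(sentence):
    """Return (participle form, order) pairs for participles adjacent to BE.

    order is "PPP-BE" if the participle directly precedes finite BE and
    "BE-PPP" if it directly follows finite BE.
    """
    hits = []
    for left, right in zip(sentence, sentence[1:]):
        if is_candidate(left) and is_finite_be(right):
            hits.append((left.form, "PPP-BE"))
        elif is_finite_be(left) and is_candidate(right):
            hits.append((right.form, "BE-PPP"))
    return hits


# Fragment of example (7): gosti zvany byli 'the guests were invited'.
fragment_7 = [
    Token("gosti", "gost'", frozenset({"s"})),
    Token("zvany", "zvat'", IPF_PPP_TAGS | {"brev"}),
    Token("byli", "byt'", frozenset({FINITE_TAG, "praet"})),
]
print(find_contexts(fragment_7))  # [('zvany', 'PPP-BE')]
```

Recording the order of participle and BE for each hit is what makes it possible to compare the two word orders, as in the context counts reported above.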

In the following, we will show that imperfective PPPs are not limited to idiomatic expressions, but that we find regular, repeated forms with predictable compositional meaning (§2.1) that occur in both adjectival and verbal passives (§2.2). We will therefore conclude that such participles (both adjectival and verbal ones) need to be accounted for, uniformly, and not just discarded as exceptions.<sup>4</sup>

<sup>3</sup>http://ruscorpora.ru/

<sup>4</sup>A reviewer points out that our data sound archaic. However, we carefully separated all the truly archaic examples (e.g., 17th–18th century and before); only one of those appears in the paper, in (10), and we state explicitly that this is an archaic example. All the other examples here are mostly from literary sources from the 1950s–60s, so they cannot be classified as 'archaic'. We think that the reviewer might not be used to these kinds of examples because they are not part of the literary norm.

In §2.3 we will conjecture that imperfective PPPs always involve the general-factual meaning of the imperfective aspect.

### **2.1 Non-idiomatic, regular imperfective past passive participles**

Our first research question was whether the widely held assumption, briefly outlined in §1, according to which all imperfective PPPs are idiomatic or frozen forms that should be analyzed as adjectives, withstands closer data scrutiny. Of course we found idiomatic participles, such as the idiom *ne lykom šit*, which is literally 'not sewn with bast fiber' but means 'not simple(-minded)'. There are also fixed expressions, such as *rožden/kreščen* 'born/baptized', and genuine adjectives, such as *viden*, literally 'seen' but actually meaning 'visible'.

However, we found a number of regular, repeated forms with predictable meaning. A non-exhaustive list of such participles is given in (6).

(6) *pisan* 'written.ipf', *čitan* 'read.ipf', *pit* 'drunk.ipf', *eden* 'eaten.ipf', *delan* 'made.ipf', *šit* 'sewn.ipf', *čekanen* 'minted.ipf', *bit* 'beaten.ipf', *strižen* 'haircut.ipf', *myt* 'washed.ipf', *brit* 'shaved.ipf', *kormlen* 'fed.ipf', *nesen* 'carried.ipf', *govoren* 'said.ipf', *prošen* 'asked.ipf', *zvan* 'called.ipf', *kusan* 'bitten.ipf', *kryt* 'covered.ipf', *njuxan* 'smelled.ipf'

We take these forms to be regular because we found various occurrences (tokens) of a given participle (type), in combination with different types of arguments. We furthermore take them to be compositional because we could not detect any idiomatic or idiosyncratic meaning in the contexts we found them in, when compared to the base verbs they are derived from. In particular, their meaning is composed of the meaning of the underlying verb and the meaning of the past passive participle (under any account of such participles; see §2.2 for further discussion).

To get a first impression of the data, some relevant examples in context are given in (7–9), which we leave uncommented at this moment but will come back to in later discussion.

(7) V silu delikatnosti situacii gosti zvany byli s osobym razborom.
    in power delicacy.gen situation.gen guests called.ipf were with particular selection
    'Due to a delicate situation the guests were invited upon careful selection.'

As (6–9) show, compositional imperfective past passive participles are not limited to one particular verb class. Nevertheless, our manual check reveals that they are often formed from verbs of saying ('say', 'ask', etc.) and incremental verbs ('write', 'sew', etc.), though not exclusively. This suggests that there might still be lexical restrictions, but this could also be due to limitations of the corpus. In §4 we speculate why this might be the case.

We furthermore found no contemporary participles derived from secondary imperfectives. The ones we did find are all archaic, i.e. at least from before the 19th century, such as the biblical (10).

(10) V leto 7010 mesjaca avgusta v šestoe na Preobraženie Gospoda našego Iisusa Xrista načata byst' podpisyvana cerkov' […]
     in summer 7010 month.gen august.gen in sixth on transfiguration lord.gen our.gen Jesus.gen Christ.gen begun.pf be.aor signed.si church
     'In the summer of 7010 on August 6th, on the day of the transfiguration of our Lord Jesus Christ they began to decorate the walls of the church (lit.: the church was begun to be painted).'

We therefore conclude for now that PPPs formed from secondary imperfectives are at most extremely rare, and in §4 we will provide some informal discussion as to why this may be.

To sum up, there are clearly compositional imperfective PPPs, which cannot simply be discarded as exceptional but need to be accounted for. Let us then turn to the kinds of passives that imperfective PPPs occur in.

### **2.2 Imperfective past passive participles in periphrastic passives**

In this section we address the question whether imperfective PPPs can be found in all kinds of passives. For example, if there were only adjectival participles, proponents of a lexical approach to such participles could still maintain that they are adjectives, not related to imperfective verbs. This would then still be in line with the widespread assumption that there are no imperfective PPPs in periphrastic passives, which are then always verbal. It should be noted, however, that we do not take adjectival participles to be non-decomposable adjectives, so ultimately we would want to provide a compositional account that also covers adjectival participles.

Let us give some general background on verbal vs. adjectival passives. We follow the by now standard assumption that adjectival participles involve adjectivization and combine with a copula, whereas verbal participles 'stay' verbal and combine with an auxiliary. For languages like English, German, and Spanish, it has been argued (see Gehrke 2011; 2015; Gehrke & Marco 2014; Alexiadou et al. 2014, and literature cited therein) that unlike with verbal passives, the underlying event in adjectival passives lacks spatiotemporal location or referential event participants, and only the state associated with the adjectival participle can be located temporally. Therefore, spatiotemporal event modifiers, referential by-/with-phrases, and similar expressions that need to access an actual event can only appear with verbal participles. In (11), this contrast is illustrated with examples from German, which makes a formal distinction between verbal and adjectival passives: the former appear with the auxiliary *werden* 'become' and the latter with the copula *sein* 'be'.<sup>5</sup>

	- b. Der Computer ist vor drei Tagen repariert #(worden).
	  the computer is before three days repaired become.ppp
	  'The computer {#is / has been ∼ was (being)} repaired three days ago.'

The modifiers in (11) relate to a spatiotemporally located event token with referential event participants, and we assume, following the above-mentioned literature, that only verbal participles make available such an event token. In contrast, non-referential by-phrases, (12a), and manner modifiers, (12b), which, we assume, derive an event subkind, are acceptable with adjectival participles.

<sup>5</sup>These and the following German examples are based on examples discussed in Gehrke (2015) and literature cited therein.

	- b. Das Haar war / wurde ziemlich schlampig gekämmt.
	  the hair was / became rather slopp(il)y combed
	  'The hair was (being) combed in a rather sloppy way.'

Finally, since adjectival passives always make available a state, any state-related modification is acceptable as well (see op.cit. for examples).

For Russian, we follow Schoorlemmer (1995) and Borik (2013; 2014) in taking short form perfective PPPs to be either verbal or adjectival; in principle, this should also hold for imperfective ones. We take the same modifier restrictions illustrated for German in (11–12) to hold for Russian adjectival participles, even if we cannot see from the form of BE alone whether we are dealing with an adjectival or a verbal participle. For example, the temporal modifier in (13) (discussed in Borik 2014, after an example from Paslawska & von Stechow 2003) does not locate the state associated with the participle but the underlying event, and therefore, irrespective of the presence/absence of BE, we have to be dealing with a verbal participle that makes available an event token for modification.

(13) Dom (byl) postroen v prošlom godu.
     house.nom was built.pf in last year
     'The house was built last year.'

Thus, if we find such event-related modifiers in our data with imperfective PPPs, we can take these to be verbal. This would then refute (or at least seriously jeopardize) the claim that they can appear only in adjectival passives.

As the examples in (14) show, we indeed found imperfective PPPs co-occurring with such event-related modifiers, highlighted in boldface. In (14a) we find a temporal modifier that locates the underlying event. (14a–14c) contain by-phrases (in Russian: instrumental-marked nominals), which are referential, since they contain a proper name, a personal pronoun, and an (inherently definite) possessive pronoun, respectively. In (14d) we have a definite spatial expression locating the underlying event.

(14) a. Pisano ėto bylo **Dostoevskim** **v 1871 godu** […]
        written.ipf that was Dostoevskij.instr in 1871 year
        'That was written by Dostoevskij in 1871.'

We thus conclude that imperfective PPPs can appear in unambiguously verbal passives and can therefore not be reduced to adjectives.

On the other hand, it is also not the case that all imperfective PPPs are verbal. The following two examples illustrate adjectival PPPs: (15a) involves a non-referential instrumental case-marked NP that characterizes the state that the house is in,<sup>6</sup> and the adverbial manner modifier in (15b) can only describe a resulting haircut 'style', but not the process of cutting hair.

	- b. My oba byli striženy **nagolo** […]
	  we both were haircut.ipf bald
	  'We were both shorn / we both had shaven heads.'

We therefore conclude this section by stating that imperfective PPPs appear in both verbal and adjectival passives in Russian, and that their distribution is not limited to a specific passive construction. In the next section, we turn to the meaning expressed in such passives, namely the general-factual meaning of the imperfective aspect.

### **2.3 General-factual imperfective past passive participles**

In this section, we discuss the imperfective contexts that the participles in question appear in. We could corroborate Knjazev's (2007) generalization that they are found in non-progressive imperfective contexts only. In particular, we hypothesize that all the examples with imperfective PPPs that we found can be analyzed as one or the other type of the general-factual meaning of the imperfective. In the following, we give a brief introduction to this kind of reading.

<sup>6</sup>We take 'cover' here to be used as a stative extent predicate, rather than an eventive change-of-state predicate; see Gawron (2009).

### **2.3.1 The general-factual meaning of the Russian imperfective**

The term general-factual (*obščefaktičeskoe*) goes back to Maslov (1959) (for recent discussion see Mehlig 2016). While this is a well-discussed imperfective meaning, there is no real consensus in the literature (see Grønn 2003: chapter 4 for an overview and references) as to the precise empirical delineation of this meaning, the question whether or not there are subtypes and if there are, how many, or the theoretical account: Is this an imperfective meaning in its own right, or is it a subtype of core imperfective meanings (i.e. process or iterative/habitual)? What most authors agree on, however, is that factual imperfectives are in aspectual competition with their perfective counterparts, in the sense that in many such contexts the imperfective can be replaced by the perfective, with only subtle meaning differences. In particular, if we are to find a meaning difference at all, it has nothing to do with, e.g., a completed event for the PF and an incomplete one for the IPF. We illustrate this with some of Padučeva's (1996) classical general-factual examples in (16), and their perfective counterparts in (17).

	- b. Gde apel'siny pokupali?
	  where oranges.acc bought.ipf.pl
	  'Where did they/you buy the(se) oranges?'

	- b. Gde apel'siny kupili?
	  where oranges.acc bought.pf.pl
	  'Where did they/you buy the(se) oranges?'

In both these examples, we are dealing with one-time completed events in the past (cleaning the room and buying oranges), no matter whether the IPF or the PF is used.

Grønn (2003) discerns two subtypes of the general-factual meaning: existential and presuppositional.<sup>7</sup> Existential imperfectives often (but not always) have intonational focus on the verb and are incompatible with precise temporal expressions locating an event. Thus, if we find temporal modifiers at all, these have to be rather vague, or they are temporal frame adverbials specifying a larger interval within which a (series of) event(s) happened (at some point in time or other). There are also contexts which actually require existential imperfectives, such as the epistemically indefinite *kogda-nibud'* 'ever' in (18).

(18) Ty kogda-nibud' {pročityval / #pročital / čital} roman Prusta do konca?
     you ever read.si read.pf read.ipf novel Proust.gen until end
     'Have you ever read a novel by Proust to the end?' (Grønn 2003: 73)

Since we will mostly focus on the other type of factual meaning, the presuppositional one, we will not discuss theoretical accounts of existential imperfectives here. Informally this reading can be characterized as 'there was (at least) one event of that type', or, under negation, 'there was no (∼ never any) event of that type' (see Mehlig 2001; 2013; Mueller-Reichau 2013; 2015; Mueller-Reichau & Gehrke 2015). We follow a more general assumption in the literature that the use of existential imperfectives is due to the non-uniqueness, or temporal indefiniteness / non-specificity of the event; when this is marked explicitly, e.g. by *kogda-nibud'* in (18), the use of the perfective becomes impossible (see op.cit. for further discussion).
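As a rough illustration of these informal paraphrases only (not a formalization proposed here), and assuming a placeholder event predicate $P$ for 'event of that type', the affirmative and the negated case could be written as:

$$
\exists e\,[P(e)] \qquad \text{vs.} \qquad \neg\exists e\,[P(e)]
$$

On this sketch the existence of an event of the relevant type is asserted (or denied), with no commitment to a unique or temporally specific event.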

Presuppositional imperfectives, in turn, come with a different information structure: The verb is never accentuated, and focus is on some other constituent in the sentence. This imperfective use is found in the examples in (16) and is furthermore illustrated by the boldfaced verb form in (19), where focus is on the clefted pronoun *ty* 'you' (focus is marked by subscript F).

(19) Anna otkrovenno brosila emu v lico obvinenie: ėto ty<sub>F</sub> **ubival** ix, a ispol'zoval dlja ėtogo menja!
     Anna openly threw.pf him in face accusation that you killed.ipf them and used.(i)pf for that me
     'Anna openly accused him: It was you who killed them, and you used me to achieve your goal!' (after Grønn 2003: 131)

<sup>7</sup>These roughly correspond to Padučeva's (1996) existential/concrete general-factual vs. actional distinction.

The second sentence in (20) (attributed to Forsyth 1970) is another case of the presuppositional imperfective, as discussed in Grønn (2003: 192f.). The first sentence introduces the completed past event 'write my first love letter' with a perfective verb form (*napisal*). The second sentence is still about this very same event, picked up by the imperfective 'write'; the event, however, is backgrounded and the intonational focus is on the modifier *karandašom* 'with pencil'.

(20) V ėtoj porternoj ja […] napisal pervoe ljubovnoe pis'mo. **Pisal** karandašom<sub>F</sub>.
     in this tavern I wrote.pf first love letter wrote.ipf pencil.instr
     'In this tavern, I wrote my first love letter. I wrote it with a pencil.'

Grønn assumes that at the VP level this information structure leads to a background–focus division (in the sense of Krifka 2001). Backgrounded material is argued to be transformed into a presupposition, following the Background/Presupposition Rule in Geurts & van der Sandt (1997). Grønn's DRT formalization of the semantics of the VP in the second sentence in (20), after application of the Background/Presupposition Rule, is given in (21) (Grønn 2003: 193).<sup>8</sup>

(21) $[\![\mathrm{VP}]\!] = \lambda e\,[x \mid \mathrm{INSTRUMENT}(e, x),\ \mathrm{pencil}(x)]_{[\;\mid\;\mathrm{write}(e)]}$

<sup>8</sup> Instead of the probably more familiar box notation for DRSs, Grønn employs a simplified linear notation: To the left of | are the discourse referents one normally finds at the top of a DRS box (*x* in (21)) and to the right of it are the conditions on such discourse referents, separated by commas (for further discussion see Grønn 2003: 43).

The VP in (21) is further embedded under AspP. Grønn (2003) argues for an underspecified meaning of the imperfective, with the event time overlapping the reference time (building on Klein 1995). He assumes that this meaning can be strengthened, in the right context, to the kind of perfective meaning we get with factual IPFs. In a more recent paper, Grønn (2015) refrains from giving the Russian IPF a uniform denotation, and factual IPFs are argued to have the same denotation as PFs (the event time is included in the reference time). For the full formalization of this example, which also takes into account the contribution of Aspect, Tense and the overall discourse, see op.cit.

The subscripted part of (21) is argued to introduce presupposed content into the DRS: the writing event is in the background and thus presupposed, whereas 'with pencil' is in focus and part of the assertoric content. According to Grønn (2003: 192), "the verbal predicate has an eventive argument, an instantiation of which is presupposed, i.e. given (more or less entailed) in the input context". Presuppositions are treated as anaphora, which can be bound to an antecedent, e.g. the perfective *napisal* in the first sentence in (20), or justified by the input context, as in (22).

(22) Dlja bol'šinstva znakomyx vaš [**ot"ezd**]<sub>(pseudo-)antecedent</sub> stal<sub>PF</sub> polnoj neožidannost'ju … Vy [**uezžali**<sub>IPF</sub>]<sub>anaphora</sub> v Ameriku [ot čego-to, k čemu-to ili že prosto voznamerilis'<sub>PF</sub> spokojno provesti<sub>PF</sub> tam buduščuju starost']<sub>F</sub>?
     'For most of your friends your departure to America came as a total surprise … Did you leave for America for a particular reason or with a certain goal, or did you simply decide to spend your retirement calmly over there?' (Grønn 2003: 207f.)

The nominalization *vaš ot"ezd* 'your departure' (lit. 'off-drival') in the first sentence of (22) introduces a (one-time, completed) departure event by the addressee. This event is picked up again by the imperfective verb form *uezžali* 'away-drove' (lit.), which contains a semantically related prefix and the same verbal root ('drive'). In this second sentence, the departure event is backgrounded with respect to the focused elements that inquire about the reason or purpose of the departure.

Returning to imperfective PPPs, a crucial indication that they express a (subtype of the) general-factual imperfective meaning is the following. Recall from the beginning of §2.3.1 that it holds for the general-factual meaning more generally that (in most cases) both imperfective and perfective word forms can be used, with only subtle meaning differences. When we compare our imperfective participles with their perfective variants (in those cases where a perfective option exists), we get the same effect. This is true of both verbal and adjectival participles, hence we classify them as factual imperfectives. (23) illustrates this for some of the examples in (14) and (15) (other examples that we identified as presuppositional imperfectives behave similarly).

	- b. (Po)kryt byl dom solomoj […]
	  (pf)covered was house hay.instr
	  'The house was covered with hay.'
	- c. My oba byli (po)striženy nagolo […]
	  we both were (pf)haircut.ipf bald
	  'We were both shorn / we both had shaven heads.'

The meaning differences between imperfective and perfective participles are, as expected, very fuzzy and difficult to describe, since in all these cases we have one-time, completed events or states located in the past.

In the following, we will first briefly describe existential imperfective PPPs, although an account of this class is left for future research. Then we zoom in on the presuppositional ones and their analysis.

### **2.3.2 Existential imperfective past passive participles**

Typical imperfectivity-inducing contexts discussed in the literature include negation, repetition, and habituality. Some of the contexts in which we found imperfective participles could, in principle, be described as such. For example, (24) illustrates negated or negative events.

	- b. Mojka byla perepolnena nemytoj posudoj. Ne myto bylo davno.
	  sink was overflown.pf unwashed.instr dishes.instr not washed.ipf was long-time
	  'The sink was overflowing with unwashed dishes. The dishes had not been done in a long time.'

The following examples involve event repetition (in the broadest sense), evidenced by pluractional markers (25) or markers of repeatability/iterativity (26) (in boldface).

	- b. Za čto **neodnokratno** byla bita […]
	  for what not-once was beaten.ipf
	  'For what she was beaten more than once.'

We propose that all these contexts have the informal characteristics of existential imperfectives, outlined in the previous section. In particular, they state that 'there were no events of that type (at some point in time or other)' (for the negated examples) and 'there were events of that type (at some point in time or other)' (for the other examples). We conjecture that among our previous examples, (8) (negation) and (9) (event repetition) also contain existential imperfectives, but we will leave this for further research. The main focus of this paper is presuppositional imperfective PPPs, to which we turn now.

### **2.3.3 Presuppositional imperfective past passive participles**

We argue that a prominent subset of the imperfective PPPs we found should be analyzed as presuppositional imperfectives, because they display hallmark properties of presuppositional imperfectives: Intonational focus is never on the verb but on some other element in the sentence, and a completed event is backgrounded and presupposed. In focus we find modifiers specifying the manner, quality, purpose or other aspect of the event itself (and not its culmination).<sup>9</sup> In fact, removing the modifiers significantly decreases the acceptability of these examples, though it might be possible to leave them out in the right context. Relevant examples are given in (27).

	- b. Zapiski byli pisany ne dlja pečati<sub>F</sub> [… no…]
	  notes were written.ipf not for print but
	  'The notes were written not for print, but …'

The kind of background–focus division typical for presuppositional imperfectives, as described in the previous subsection, is thus also found in our examples. This information structure is frequently accompanied by a marked word order that has the participle (i.e. the backgrounded material) in sentence-initial topic position and the modifier (i.e. the focused material) at the end, after BE, or in some other prominent position, see (28a). This word order is marked with respect to the unmarked order of the participle following BE, which is otherwise much more frequent (recall our context count in the beginning of §2). More such examples are given in (28).

<sup>9</sup>An anonymous reviewer pointed out that our corpus only contains written texts so that we cannot know where focus is in these sentences. We are reporting here the native Russian intuitions of the first author of this paper.


	- b. Znamenityj Famous pokojnik deceased.nom nesen carried.ipf byl was do until mogily grave [na on rukax] <sup>F</sup> arms […] 'The famous deceased was carried in arms until the grave.'

We also find this word order in examples already discussed, namely (7), (14a–14c), (15a), and (27a), which, we argue, also involve presuppositional imperfectives, as evidenced by the additional focused modifiers. However, this marked word order is not obligatory for presuppositional imperfective participles, as we see in (27b); what is relevant is the background–focus division described above. Finally, this marked word order is not restricted to presuppositional imperfectives. For example, (25), which was argued to involve an existential imperfective, shows the same marked word order. This example is crucially different from the presuppositional imperfectives discussed here, though, in that there is no modifier in focus and the intonational focus is instead on the predicate.

### **3 The semantics of presuppositional imperfective past passive participles**

We propose to extend Grønn's (2003) account of presuppositional imperfectives, which originally only covered active cases and which was illustrated in (21), to passives.<sup>10</sup> For example, the analysis of the VP in (27a), repeated as (29), is given in (30).

(29) Stroeno built.ipf bylo was ėto that ploxo, badly xromo, lamely ščeljasto. with.holes 'It was built badly, lamely, with holes.'

<sup>10</sup>Note that Grønn (2003) acknowledges that factual IPFs are not restricted to past tense contexts but that he only concentrated on such contexts for convenience. In Grønn (2015) he briefly mentions other IPF forms that could be analyzed along the same line, including, e.g., past active participles like *čitavšij* 'having read'. Our contribution in this respect is that we broaden the empirical coverage to include the passive data that has previously gone unnoticed, due to the (we hope to have shown) erroneous assumption that IPF PPPs do not deserve a proper compositional analysis.


(30) $\llbracket \mathrm{VP} \rrbracket = \lambda e\,[\ \mid\ \mathrm{bad}(e),\ \mathrm{lame}(e),\ \mathrm{with\ holes}(e)\,]_{[\ \mid\ \mathrm{build}(e)\,]}$

Under this analysis, the completion/culmination of the event is not part of the asserted meaning, and the imperfective shifts the focus to another aspect of the event, expressed by the modifier, instead of the culmination of the event itself.

The presuppositional account makes a number of predictions. One is that presuppositions project, in the sense that, e.g., negation affects only the asserted but not the presuppositional content. Thus, if the existence of a completed event is presupposed in the positive counterpart, as illustrated in (27), the same holds in a corresponding negated sentence in (31).

	- b. Zapiski notes ne not byli were pisany written.ipf ne not dlja for pečati print [… no but …] 'It is not the case that the notes were written not for print, but …'

From both the original and the negated examples we infer the existence of a (completed) event, and what is negated in (31) is only the contribution of the modifier.<sup>11</sup>
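The projection behaviour just described can be pictured with a small sketch. It assumes a deliberately simplified two-part meaning (presupposed vs. asserted conditions); the class `Meaning` and the helpers `negate` and `event_is_inferred` are hypothetical names introduced only for illustration and are not part of Grønn's formalism or of the analysis proposed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meaning:
    """Toy two-part meaning: presupposed vs. asserted conditions (an assumption)."""
    presupposed: frozenset  # conditions taken for granted
    asserted: frozenset     # conditions that are at issue
    negated: bool = False   # polarity of the asserted part only


def negate(m: Meaning) -> Meaning:
    # Negation flips the polarity of the assertion and leaves the
    # presupposition untouched; this models projection in the sketch.
    return Meaning(m.presupposed, m.asserted, not m.negated)


def event_is_inferred(m: Meaning, event_condition: str) -> bool:
    # The event is inferred whenever its condition is presupposed,
    # whatever the polarity of the assertion.
    return event_condition in m.presupposed


# Toy rendering of (27b): the writing event is presupposed,
# the purpose modifier is asserted.
positive = Meaning(frozenset({"write(e)"}), frozenset({"not_for_print(e)"}))
# Toy rendering of the negated counterpart (31b).
negative = negate(positive)

print(event_is_inferred(positive, "write(e)"))
print(event_is_inferred(negative, "write(e)"))
```

In this sketch only the flag on the asserted part changes under negation, which mirrors the observation that (31) negates the contribution of the modifier and not the existence of the event.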

Furthermore, if our imperfective PPPs are indeed presuppositional, the presupposed events should be bound to a perfective in the context or be justifiable by the input context, as we briefly discussed in §2.3.1. It is important to note at this point that many of Grønn's presuppositional imperfective examples in context do not pick up an identical perfective verb form, as in Grønn's (20); rather, they seem to be merely 'justifiable in context', as in Grønn's (22). What does it mean, then, to be justifiable in context?

In the nominal domain, anaphora to previously introduced discourse referents can be expressed by pronouns or by definite descriptions. For example, in (32), the indefinite *a sister* in the first sentence introduces a new discourse referent. The second sentence shows that this discourse referent can be picked up by a pronoun, by a definite description with identical lexical material (*sister*), but also by a definite description that merely contains a related lexical noun, the hyperonym *girl*.

<sup>11</sup>The negated examples in (31) (in particular (31b) with the double negation) sound somewhat unnatural, due to the fact that sentential negation usually negates the whole predicate, including the event. Nevertheless, to the extent that they are ok, they still imply event completion.


(32) Bruno has a sister that lives in London. He loves {her / his sister / the girl} a lot.

Definite descriptions (but not pronouns) can also be used as bridging anaphora, such as *the window screen* in (33).

(33) Carla was driving to work. The window screen was full of dead bugs.

In the verbal domain, pronominal (i.e. pro-verbal) anaphora do not really exist, apart maybe from the event kind anaphora *so/such*. Thus, presuppositional imperfectives have to be the event counterpart of definite descriptions. These pick up previously introduced event referents, either with identical lexical material or with a hyperonym or a hyponym. Alternatively, they are "justifiable by the context", which we then take to be parallel to bridging.

Do we find such anaphoric relations of our presuppositional imperfective participles in the broader contexts they appear in? Some examples showing that we do are given in (34).

	- b. Ėto this – ne not ja I **sdelal**, did.pf ėto this – **vedeno** led.ipf bylo was moeju my.instr rukoj! hand.instr 'It wasn't me who did that, it was orchestrated by me (lit. led by my hand)!'

Example (34a) is similar to Grønn's (22), in the sense that here the presuppositional imperfective participle *plačeny* 'paid' refers back to the event inside the related nominalization 'payment'. In (34b), the imperfective 'led' does not lexically repeat the perfective 'did'; nevertheless, we argue that semantically this is a subtype of doing event and thus a hyponym, so that we are again dealing with an anaphoric relation.
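The typology of anaphoric links used in this section (identical material, hyperonym, hyponym, bridging) can be summarized in a minimal sketch. The lexicon below is a hypothetical toy that only mirrors the examples in the text, with English lemmas standing in for the Russian forms; `anaphoric_relation` is an illustrative helper, not a proposal for how antecedents are actually resolved.

```python
# Hypothetical toy lexicon; the entries only mirror examples from the text.
HYPERNYM_OF = {"sister": "girl", "lead": "do"}  # word -> more general word
# (anaphor, antecedent) pairs that are merely "justifiable by the context".
BRIDGING_LINKS = {("window screen", "drive"), ("pay", "payment")}


def anaphoric_relation(anaphor, antecedent):
    """Return how `anaphor` can pick up `antecedent`, or None if it cannot."""
    if anaphor == antecedent:
        return "identical"
    if HYPERNYM_OF.get(antecedent) == anaphor:
        return "hyperonym"  # the anaphor is more general than the antecedent
    if HYPERNYM_OF.get(anaphor) == antecedent:
        return "hyponym"    # the anaphor is more specific than the antecedent
    if (anaphor, antecedent) in BRIDGING_LINKS:
        return "bridging"
    return None


print(anaphoric_relation("girl", "sister"))        # cf. (32)
print(anaphoric_relation("window screen", "drive"))  # cf. (33)
print(anaphoric_relation("pay", "payment"))        # cf. (34a)
print(anaphoric_relation("lead", "do"))            # cf. (34b)
```

Under these assumptions (34b) comes out as a hyponym link and (34a) as a bridging-like link, in line with the discussion above.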

Finally, let us say a bit more about examples like (35) (and similarly 14a, 14b, 27b).

(35) **Pis'ma** letters ego his **pisany** written.ipf byli were černo black i and kruglo round […] 'His letters were written in black and round letters.'


We suggest that in (35), the created object *pis'ma* 'letters' can serve as anaphor for the writing event. In this case, *pis'ma* also happens to be morphologically related to *pisat'* 'write' (similarly *za-pis-ki* 'notes' in (27b)), though this is obviously not a general requirement, see (14a) and (14b).

A future task will be to check the contexts more thoroughly and systematically to see which of our imperfective PPPs really involve presupposed events, and furthermore to provide an analysis of other occurrences of such participles that do not lend themselves to an analysis in terms of presuppositional imperfectives. As we hypothesized in §2.3, they might very well turn out to all be instances of the existential meaning of the imperfective aspect, but this will have to be confirmed in further research.

### **4 Conclusion and open issues**

In this paper we have shown, based on naturally occurring data, that there are fully compositional imperfective past passive participles in Russian, which occur in regular periphrastic passives (both adjectival and verbal). We have thereby refuted the widespread assumption that such participles are non-compositional and should rather be analyzed as adjectives. We have shown that a representative subset of these participles come with a special information structure in which the verb is not accentuated but focus lies on a quasi-obligatory modifier; this often comes with a marked word order in which the participle appears in sentence-initial position or at least in a position before BE, and the modifier in focus after BE. We implemented these findings in an account of such participles as involving the presuppositional imperfective aspect, where the event (completion) is presupposed and thus backgrounded, as signalled by the use of the imperfective.

Several issues remain. First, if the empirical finding reported in §2 is indeed correct, *why are there no (contemporary) secondary imperfective past passive participles*? According to Grønn (2003), there are no morphological or lexical restrictions on factual imperfectives, so that both simple as well as secondary imperfectives should be possible. An impressionistic view in the literature, however (see also discussion in Grønn 2003, ch. 4), is illustrated by the following quote from Comrie (1976: 118): "The use of the Imperfective as a general-factual is particularly common with non-prefixed verbs, and rather less common with Imperfective verbs that owe their imperfectivity to a suffix that derives them from a Perfective." At this point we can only speculate that presuppositional imperfectives are most common with simple imperfectives because these verb forms are morphologically the least marked for grammatical or lexical aspect, and presuppositional imperfectives generally do not focus on any aspectual meaning in particular. This line of argumentation, however, would not necessarily extend to existential imperfective participles. Another possibility could be that factual imperfectives historically first arose with a core group of imperfectives (which are all simple) and then spread to others; since imperfective PPPs are already quite restricted, maybe only the core verbs are affected. Yet another option could be that there is a real grammatical/morphological restriction on secondary imperfective PPP formation in Modern Russian (as opposed to earlier stages, as evidenced by our data), though we do not really know why that would be.

A further open issue is *why we do not find more cases of imperfective past passive participles*, i.e. why the number is so low, and why we find them more frequently only with a handful of verbs, as tentatively suggested in §2. The impression that many verbs of creation appear in this context could be due to the fact that we can infer the event already from the objects themselves, as alluded to at the end of §3. In addition, we have the intuition that passives are generally not that widely used in Russian, though we do not have statistical data to back this up. A potential (informal) explanation for this could be that in languages with a fixed word order, such as English, passives take on particular information structural functions that languages with a freer word order, such as Russian, can express in active sentences with different word orders. This, then, could lead to a more restricted use of the passive, so that it is only limited to aspectual/event structural functions (see Abraham 2006 for argumentation along these lines). Another restricting factor which is suggested by our analysis comes from the specific licensing requirements for the presuppositional imperfective passives: if the anaphoric treatment of the presuppositional meaning is correct, these passives can only appear in contexts which can provide a discourse antecedent for the passive sentence.

Finally, there is the issue of *cross-Slavic variation in the expression of passives*. From a cross-Slavic perspective, the aspectual restrictions on the formation of PPPs, reported for Russian but partially refuted in this paper, are rather surprising. If we look at Czech, for example, PPPs can be derived from both imperfective and perfective verbs, across the board, and without the limited productivity of imperfective ones that we clearly find in Russian. Furthermore, such participles express verbal or adjectival passives, including passive "events in process" when we are dealing with imperfective ones (Radek Šimík, p.c.).<sup>12</sup> We can think of several possible research questions to be explored in this domain. One hypothesis could be that languages with "fully productive" imperfective and perfective PPPs (e.g. Czech) form regular periphrastic verbal passives with all imperfective and perfective meanings. For languages like Russian, then, two options are conceivable. According to the first, combinations of BE with PPPs are adjectival, and only reflexive passives are verbal. Given the availability of event token modification (recall §2.2), we find this option less convincing. The second option is that combinations of BE and past participles are either verbal or adjectival, but can only express result states (Kratzer's 2000 target states). Reflexive passives, then, which are always verbal, fill the gap, for verbs that do not have target states, as well as for passive event-in-process readings. Under this hypothesis, though, it is still unclear why the Russian periphrastic passive cannot have a process meaning, especially in the cases of verbal/eventive passives. However, there is a split in "imperfective meanings" conveyed by different passives, in the sense that the process meaning is only conveyed by reflexive passives but other imperfective meanings, sometimes called "peripheral", specifically habituality/iterativity and (all types of) factivity, are expressed by periphrastic passives (and then usually with perfective participles). What seems to be needed to explain this distribution is a competition-based analysis, possibly couched in an optimality-theoretic framework.

<sup>12</sup>Similarly, there are cross-Slavic differences in the properties of reflexive passives, which should also be taken into account; see Fehrmann et al. (2010) and Schäfer (2016) for further discussion.

### **Abbreviations**


### **Acknowledgements**

This research has partially been funded by project FFI2014-52015-P from the Ministry of Economy and Competitiveness (MINECO) and 2014SGR 1013 (awarded by the Generalitat de Catalunya) (1st author). For feedback and discussion we thank especially Hans Robert Mehlig, Atle Grønn, and two anonymous reviewers, as well as the audiences at the HSE Semantics & Pragmatics Workshop (Moscow), Event Semantics 2016 (Düsseldorf), FDSL 12, TELIC 2017 (Stuttgart), and Non-at-Issue Meaning and Information Structure (Oslo).

### **References**



### **Chapter 4**

## **Event and degree numerals: Evidence from Czech**

Mojmír Dočekal Masaryk University in Brno

Marcin Wągiel Masaryk University in Brno

> In this paper, we bring in novel data concerning the distribution and semantic properties of two classes of adverbs of quantification in Czech, i.e., event numerals such as *dvakrát* 'twice/two times' as opposed to degree numerals such as *dvojnásobně* 'doubly/twofold'. We explore the contrasts between the expressions in question including the interaction with comparatives and equatives as well as scope asymmetries. We propose that degree numerals target values on a provided scale and are, hence, best analyzed as predicates of degrees whereas event numerals have a more general semantics which primarily allows for quantification over individuated events but also makes operations on degrees possible.

**Keywords:** numerals, comparative, equative, degrees, scales, events, Czech

### **1 Introduction**

Lexicons of many natural languages distinguish between two types of expressions involving quantification which correspond to English adverbs such as *twice* and *doubly*, see (1). Surprisingly, though cardinal numerals have received a lot of attention in the semantic literature on quantification (Landman 2004, Ionin & Matushansky 2006, Hofweber 2005, and Rothstein 2012 among many others), expressions such as those in (1) remain strikingly understudied from both a descriptive and a theoretical perspective (with the notable exceptions of Landman 2006, Bhatt & Pancheva 2007, and Donazzan 2013).<sup>1</sup>

Mojmír Dočekal & Marcin Wągiel. 2018. Event and degree numerals: Evidence from Czech. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 77–108. Berlin: Language Science Press. DOI:10.5281/zenodo.2554021


The aim of this paper is to present novel data concerning the distribution and semantic properties of such expressions in Czech, exemplified by (2). In recent years the meaning of different types of Slavic derived numerals has attracted considerable attention (see Dočekal 2012; 2013 for Czech, Wągiel 2014; 2015a for Polish, and Khrizman 2015 for Russian), and thus the analysis of the presented data is part of a broader enterprise intended to examine numeral quantification from the perspective of morphologically complex languages.


In this paper, we will refer to Czech adverbs of quantification such as (2a) as event numerals (ENs), whereas expressions like (2b) will be called degree numerals (DNs). Our goal is primarily empirical, hence we will focus our attention on discussing novel data. More particularly, we will concentrate on constructions in which the degree argument is being manipulated, specifically on the interaction with comparatives and equatives. We claim that ENs are best analyzed as adverbs of quantification whose semantics is general enough to allow for counting distinctive events in terms of iteration as well as operations on degree intervals. On the other hand, DNs are in fact degree predicates which makes their distribution more restricted.

The article is structured as follows. In §2, we will discuss the distribution of Czech ENs and DNs based on the corpus study we have conducted. In §3, we will examine the key environments in which such expressions occur. In §4, we will focus on categorial and typal differences and we will bring in additional contrasts involving ENs and DNs, whereas §5 will discuss the properties of adjectival and nominal DNs. §6 will summarize the data and in §7, we will propose a predicative semantics for DNs as well as suggest an analysis of ENs. §8 concludes the paper.

<sup>1</sup>Wągiel (to appear) proposes an analysis of Slavic adjectival multipliers similar to English *double*; however, we are not aware of any semantic treatment of adverbial expressions corresponding to English *doubly*.

### **2 Distribution**

At first blush, Czech numerals such as *dvakrát* 'twice/two times' and *dvojnásobně* 'doubly/twofold' appear to be synonymous in some contexts.

	- b. Ceny prices tady here jsou are **dvakrát** twice / **dvojnásobně** doubly vyšší higher než than tam. there 'The prices here are two times higher than there.'

However, a more careful investigation reveals that there are multiple environments in which they are not. In order to determine the distribution of ENs and DNs and to define the properties of the contexts in which they occur, we conducted a corpus study based on the Czech National Corpus (CNC).<sup>2</sup> The selected corpus samples contained 100 random occurrences of the EN *dvakrát* and the DN *dvojnásobně*, which were reduced to 98 and 99 occurrences, respectively, after filtering. Figures 1 and 2 present the preferred environments in which the numerals in question appear in the samples.

The results show a significant difference in the distribution of ENs and DNs that, in our opinion, unveils the real nature of these expressions. Whereas in 77% of occurrences, *dvakrát* targets event-denoting VPs as well as temporal AdvPs and PPs,<sup>3</sup> *dvojnásobně* tends to modify comparatives, APs, and secondary predicates as well as degree-related VPs.<sup>4</sup> In total, it targets scales in 90% of the studied cases. The observed contrast suggests that *dvakrát* naturally favors event-denoting environments (though it can appear in comparatives and equatives) whereas *dvojnásobně* exhibits a very strong tendency to select for degree expressions.

<sup>2</sup>The CNC is a representative corpus of contemporary Czech. We have selected the SYN2015 subcorpus (Křen et al. 2015), which is the largest reference corpus of contemporary written Czech consisting of more than 100 million tokens. We searched for the lemmas *dvakrát* and *dvojnásobně*.

<sup>3</sup> Following Doetjes (2007), we assume that adverbials such as *dvakrát denně* 'twice a day' and *dvakrát za týden* 'twice a week' are similar to frequency expressions in the sense that their interpretation is dependent on the time interval they introduce.

<sup>4</sup>Out of 30 VPs modified by *dvojnásobně* 9 were headed by deadjectival verbs, e.g., *zvětšit* 'enlarge' and *zvýšit* 'raise', whereas 11 involved predicates inherently associated with scales including verbs operating on degrees such as *zvednout* and *vzrůst* 'increase'. The remaining 10 examples involved predicates such as *platit* 'pay', *trestat* 'punish', and *jásat* 'rejoice' which arguably at least to some extent also pertain to the notion of gradability.


In the following sections, we will examine two contexts we assume to be crucial for understanding the character of the EN/DN alternation as well as further contrasts and differences between those expressions.

### **3 Key contexts**

### **3.1 Degrees and differentials**

The first environment to be discussed is constituted by degree constructions involving comparison. Both ENs and DNs can appear in comparatives as differentials, as attested by the examples from the CNC corpus in (4).


(4) a. … je is dnes today až even **dvojnásobně** doubly větší bigger nebezpečí danger ničivých destructive povodní floods než than před before 20 20 lety. years (CNC) '… today, the danger of destructive floods is two times bigger than 20 years ago.'

b. … a and tak thus se refl dokážou manage **dvakrát** twice rychleji faster ohřát heat nebo or zchladit cool.down než than běžné ordinary žehličky. irons (CNC) '… and thus they can heat or cool down two times faster than ordinary irons.'

Furthermore, both ENs and DNs are unacceptable in superlatives.<sup>5</sup>

(5) \*Petr Petr je is **dvakrát** twice / **dvojnásobně** doubly nejvyšší. tallest

Nevertheless, an interesting contrast arises when we consider equatives. Though Czech ENs are perfectly fine in such an environment, see (6), DNs are significantly less acceptable in equatives than in comparatives, as witnessed by the oddity of (7b).<sup>6</sup> In addition, there are no attested occurrences of equatives with DNs in CNC.

	- b. Petr Petr je is **dvakrát** twice tak so vysoký tall jako how Marie. Marie 'Petr is twice as tall as Marie.'

<sup>5</sup>ENs may appear as superlative modifiers, e.g., in the past tense. However, a sentence such as *Petr byl dvakrát nejvyšší* 'Petr was the tallest twice' has only an event reading which states that there were two occasions on which Petr was the tallest one among the compared individuals. Therefore, it seems that in such cases the EN modifies the whole phrase, i.e., the copula and the superlative, rather than the superlative alone.

<sup>6</sup>A similar contrast between *twice* and *two times* in English has been observed in Gobeski (2011).


b. \* Petr Petr je is **dvojnásobně** doubly tak so vysoký tall jako how Marie. Marie

This property of DNs corresponds to the behavior of standard differentials, which, as indicated in (8), although frequently attested in comparatives, are not possible in equatives.<sup>7</sup>

	- b. \* Petr Petr je is **o** by **10** 10 **cm** cm tak so vysoký tall jako how Marie. Marie

These data seem to suggest that though both ENs and DNs can operate on scales, they differ in that they employ distinct strategies to modify the degree they target. On the basis of the presented evidence, we assume it is plausible to hypothesize that DNs share core semantic properties with differentials. On the other hand, the compatibility of ENs with equatives seems to imply that they are expressions of a very distinct type.

### **3.2 Count events**

The second key environment to be discussed here involves VPs referring to individuated count events. Multiple examples attested in the CNC corroborate the well-known fact that ENs can combine with VPs in order to quantify over eventualities. Interestingly, as witnessed by the ungrammaticality of (9b), DNs cannot be used to count events.

	- b. \* **Dvojnásobně** doubly se refl přesvědčím, I.will.ensure že that jsou are dvířka door zavřená. closed

Not surprisingly, neither ENs nor DNs modify VPs denoting homogeneous eventualities such as static states, as demonstrated in (10). As expected, no such examples were found in the CNC samples.

<sup>7</sup>As an anonymous reviewer points out, it seems that (8b) is out because equatives need to apply AP-internally, before the degree variable *d* is bound, for instance by the POS operator (e.g., Kennedy & McNally 2005).


(10) \* Petr Petr **dvakrát** twice / **dvojnásobně** doubly zná knows Marii. Marie

Another observation concerns VPs referring to values on scales. While both ENs and DNs can modify verbs such as *vzrůst* 'increase', there is an asymmetry with respect to possible readings of sentences containing such phrases. Let us consider the contrast between (11b) from the CNC and the corresponding example in (11a). As indicated in the translation, (11a) is ambiguous between the quantified-degree and the quantified-event interpretation, i.e., it is either true of a scenario where the demand increased by two times irrespective of the number of times it increased, or of a situation where there were two events of increasing the demand, irrespective of the value by which the demand was increased. Crucially, (11b) lacks the quantified-event interpretation and can only be true of a scenario in which the degree of increase was multiplied by two.

	- demand after subsidies increased doubly 'The demand for subsidies increased doubly.'
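The asymmetry between the two readings can be made concrete with a small sketch. It assumes, purely for illustration, that a scenario is a list of multiplicative increase factors, one per increase event, that the quantified-degree reading concerns the overall factor, and that the quantified-event reading requires exactly two events; the function names are hypothetical and the sketch does not anticipate the semantics proposed later in the paper.

```python
from math import isclose, prod


def degree_reading(factors, n=2):
    # Quantified-degree reading: the overall increase amounts to a factor
    # of n, however many increase events there were.
    return len(factors) > 0 and isclose(prod(factors), n)


def event_reading(factors, n=2):
    # Quantified-event reading: there were n increase events,
    # whatever the size of each increase.
    return len(factors) == n


def en_true(factors, n=2):
    # EN variant, cf. (11a): either reading makes the sentence true.
    return degree_reading(factors, n) or event_reading(factors, n)


def dn_true(factors, n=2):
    # DN variant, cf. (11b): only the degree reading is available.
    return degree_reading(factors, n)


scenarios = {
    "one increase, doubled overall": [2.0],
    "two small increases": [1.1, 1.1],
    "three increases, doubled overall": [1.25, 1.25, 1.28],
}
for label, factors in scenarios.items():
    print(label, "| EN:", en_true(factors), "| DN:", dn_true(factors))
```

The second scenario is the one that separates the two numerals in this toy model: it satisfies the EN variant through the event reading but not the DN variant.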

The discussed observations further support the semantic nature of the EN/DN alternation. At this point, it seems innocuous to state that the distinction relies on the strategy the expressions in question make use of in terms of quantification. Whereas DNs are unable to count events and are restricted to operations on degrees, ENs seem to employ a more general semantics which allows for quantification over both events and degrees. Further differences will be examined in the next section.

### **4 More contrasts**

### **4.1 Categorial differences**

Another difference between Czech ENs and DNs concerns their derivational potential. Both classes involve morphologically complex expressions derived from a numeral root, e.g., *dv-* (corresponding to English *tw-*), by different suffixes, i.e., *-krát* and *-násobn-*.<sup>8</sup> However, the contrast between (12) and (13) indicates an apparent categorial asymmetry. Unlike DNs, which employ distinct morphology to display a broad range of syntactic categories including adverbial, adjectival, and nominal forms (all derived from the same stem), ENs are defective in the sense that they have only adverbial forms and cannot appear in syntactic contexts which are restricted to adjectives and nominals.<sup>9</sup>

<sup>8</sup> In fact, *-násobn-* can be further decomposed at least to *-násob-*, as attested in *násobit* 'multiply', and *-n-*. For the sake of simplicity, we will ignore the morphological complexity here.


Although the categorial asymmetry provided in (12) and (13) may suggest that ENs and DNs are exponents of distinct semantic objects, as such it is, of course, insufficient to draw a typal distinction between the two. In the next section, we will investigate such a possibility in more detail.

### **4.2 Typal compatibility**

A further observation concerns the fact that ENs and DNs in Czech can be stacked, as witnessed by the grammaticality of examples such as (14a). This suggests that Czech expressions of those kinds are compatible in terms of their semantic types. Moreover, the reversed order of numerals, as provided in (14b), is not possible, which further suggests different input requirements.

(14) a. Petrovi for.Petr se refl to this **třikrát** thrice **dvojnásobně** doubly vyplatilo. paid.off 'For Petr it paid off doubly three times.'

<sup>9</sup> It should be noted that the inability of ENs to take adjectival and nominal morphology seems to be a Czech idiosyncrasy since, for instance, Polish allows for forms such as *dwukrotny* 'twice.A' and *dwukrotność* 'twice.N'. Similarly, there is adjectival *dvukratnyj* in Russian and *dvakratni* in Slovenian. However, a detailed cross-linguistic comparison of ENs and DNs is beyond the scope of this paper and constitutes a challenge for further research.


b. \* Petrovi for.Petr se refl to this **dvojnásobně** doubly **třikrát** thrice vyplatilo. paid.off

Furthermore, there is solid evidence that unlike ENs, DNs are anchored to a particular event. Let us consider possible interpretations of a sentence such as (15) in which the conjoined NP in subject position denotes a plurality of entities whereas the modified VP refers to a plurality of events. As indicated in (15a) and (15b), the sentence can either have a distributive reading where the events of paying off doubly are distributed equally over each of the individuals, i.e., Petr and Honza, or a collective reading in which it paid off doubly three times for Petr and Honza as a group. Moreover, a cumulative interpretation as in (15c) is also possible. In such a scenario there was a total of three events of paying off doubly and Petr and Honza share the total gain disproportionately. Nevertheless, (15) cannot have a meaning such as the one in (15d) or in (15e). It is impossible to understand the sentence in such a way that the total gain corresponds to six units, similar to (15b) or (15c), but the total number of events is smaller or greater than three. Such cumulations are simply inaccessible, which implies that DNs cannot outscope the event quantifier and are forced to operate on degrees within a particular event.

	- a. for Petr: 3 × (it-paid-off × 2) + for Honza: 3 × (it-paid-off × 2)
	- b. for Petr+Honza: 3 × (it-paid-off × 2)
	- c. for Petr: 2 × (it-paid-off × 2) + for Honza: 1 × (it-paid-off × 2)
	- d. \* for Petr+Honza: 2 × (it-paid-off × 3)
	- e. \* for Petr: 4 × (it-paid-off × 1) + for Honza: 1 × (it-paid-off × 2)
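The arithmetic behind (15a–e) can be checked with a short sketch. Each reading is encoded as a list of hypothetical triples (beneficiary, number of events, degree multiplier per event); the availability condition is our informal paraphrase of the generalization in the text (the DN holds inside every event, the EN counts events per beneficiary or in total) and is an assumption, not the formal analysis.

```python
# Each reading: list of (beneficiary, number_of_events, multiplier_per_event).
READINGS = {
    "15a": [("Petr", 3, 2), ("Honza", 3, 2)],  # distributive
    "15b": [("Petr+Honza", 3, 2)],             # collective
    "15c": [("Petr", 2, 2), ("Honza", 1, 2)],  # cumulative
    "15d": [("Petr+Honza", 2, 3)],             # reported as unavailable
    "15e": [("Petr", 4, 1), ("Honza", 1, 2)],  # reported as unavailable
}


def total_events(reading):
    return sum(n for _, n, _ in reading)


def total_gain(reading):
    # Gain units: number of events times the multiplier inside each event.
    return sum(n * k for _, n, k in reading)


def available(reading, en=3, dn=2):
    # The DN must hold inside every single event ...
    dn_ok = all(k == dn for _, _, k in reading)
    # ... and the EN counts events either per beneficiary (distributive)
    # or in total (collective, cumulative).
    en_ok = all(n == en for _, n, _ in reading) or total_events(reading) == en
    return dn_ok and en_ok


for label, reading in READINGS.items():
    print(label, total_events(reading), total_gain(reading), available(reading))
```

Note that (15d) and (15e) reach the same total gain of six units as (15b) and (15c) but with two and five events, respectively, which is exactly the configuration the text excludes.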

The data clearly demonstrate that adverbial ENs and DNs differ with respect to their semantic type and scopal properties. The following sections will explore some additional semantic phenomena related to adjectival and nominal forms of DNs.

### **5 Adjectival and nominal degree numerals**

### **5.1 Quantification over amounts and values**

Let us now consider Czech adjectival DNs such as *dvojnásobný* 'double/two-time'. The CNC data confirm our intuition that such expressions often modify amount nominals and nouns implicitly associated with scales like those in (16).<sup>10</sup> In those contexts, the DN appears to multiply a contextually provided value on a particular scale. As a result, the predicates in (16) are true of a volume twice as high and a salary twice as high, respectively.

	- b. dvojnásobný double plat salary 'double the salary'

Interestingly, adjectival DNs are not compatible with container nouns, as the contrast between (17a) and (17b) shows. This property differentiates them from basic cardinal numerals, since in order to quantify over amounts determined by container nominals Czech requires cardinals to do the job, see (17c). Czech cardinals, however, are unable to combine with amount nouns to count quantities, as witnessed by the ungrammaticality of (17d).

	- b. \* dvojnásobný double hrnek cup čaje tea
	- c. dva two hrnky cups čaje tea 'two cups of tea'
	- d. \* dvě two množství amount čaje tea

The data discussed above show that DNs and cardinals are in complementary distribution with respect to container and amount nouns. This fact suggests that the two types of expressions in question make use of distinct quantificational strategies and should be analyzed differently.

<sup>10</sup>In the CNC, among the 15 most frequent collocation candidates for the lemma *dvojnásobný* (1,567 occurrences in SYN2015) one can find the following nouns: *počet* 'number', *množství* 'amount', *cena* 'price', and *rychlost* 'speed'.


### **5.2 Events and social roles**

Amount nouns do not exhaust the combinatorial potential of adjectival DNs since they can also modify two other classes of expressions, specifically nominals referring to events, as exemplified in (18a), as well as nominals denoting social functions such as family roles and public capacities, see (18b).<sup>11</sup> Nevertheless, the interpretation of such phrases differs from the meaning of, e.g., (16), in which the DN seems to merely multiply the value indicated by the implicit degree argument of the amount nominal. For instance, (18a) refers to a set of murdering events involving two victims in each such event, i.e., the DN seems to access an internal argument of the deverbal nominal. On the other hand, similar to what was observed in Wągiel (2015b), examples such as (18b) denote a set of individuals that have gained a particular property two times – in this case, the property of becoming a champion.<sup>12</sup>

	- b. dvojnásobný double mistr champion 'two-time champion'

Further evidence that amount NPs and nominals implicitly associated with scales substantially differ from nominals denoting events or social roles modified by adjectival DNs comes from the distribution of nominal DNs such as *dvojnásobek* 'double.N'. As demonstrated in (19), such nominalizations cannot take expressions referring to events or social roles as their complements though they frequently combine with amount nominals.<sup>13</sup>

	- b. \* dvojnásobek double.N vraždy murder / mistra champion

<sup>11</sup>The CNC collocation candidates list includes, among others, the following examples for the first class: *vražda* 'murder', *přesilovka* 'power play', and *radost* 'joy', as well as *vítěz* 'winner', *matka* 'mother', and *účastník* 'participant' for the latter.

<sup>12</sup>Notice that such behavior seems to be a Czech idiosyncrasy since many other languages make use of a different adjective to express such a meaning, e.g., see the English translation in (18b).

<sup>13</sup>For instance, the CNC lists the following among the 15 most frequent collocation candidates for the lemma *dvojnásobek* (845 hits in SYN2015): *cena* 'price', *částka* 'sum of money', *počet* 'number', and *velikost* 'quantity'.


Moreover, the asymmetry is further supported by the contrast in (20). In such examples, *je* 'is' is not used as a copula of predication, but rather it seems to establish the identity relation between the denotation of its complement and that of the subject NP.<sup>14</sup> In (20a), the definiendum, i.e., the modified degree noun, is associated with the definiens comprising the comparative construction. On the other hand, (20b) and (20c) are odd since neither *mistr* 'champion', nor *sebevražda* 'suicide' provides a degree argument to be accessed by the DN, and thus the subject NPs are not equivalent to the corresponding comparatives. In other words, since the subjects and the nominals within the matrix predicates in (20b) and (20c) refer to different entities, establishing the identity relation is impossible.<sup>15</sup>

	- b. # **Dvojnásobný** double mistr champion je is **dvakrát** twice větší bigger mistr. champion
	- c. # **Dvojnásobná** sebevražda je **dvakrát** větší sebevražda. double suicide is twice bigger suicide

The contrasts described above indicate that adjectival and nominal DNs display heterogeneous behavior in interaction with NPs implicitly associated with scales on the one hand and with event and social role nominals on the other. Possibly, the relationship between the two types of phrases is much less straightforward than it might initially seem. In this paper, however, we are primarily concerned with examples such as (16a) and we assume that use of adjectival DNs to be the basic one.

### **5.3 Predicate position**

Finally, the last observation concerns the attributive and predicative use of adjectival DNs. In all the examples provided in the previous sections, *dvojnásobný* appears as a nominal modifier, which seems to be the most natural syntactic context for such an expression. Nevertheless, it is not unusual to find *dvojnásobný* in predicate position as well, as attested in a CNC sentence in (21). Here, the DN serves as the main predicate of a sentence and assigns a property to a subject denoting an amount, i.e., *hodnota* 'value'. In particular, it is predicated of the value of saved property that it is twice as high relative to the value corresponding to the damages, i.e., it amounts to 100 000 CZK.

<sup>14</sup>Note that (20a) is ungrammatical with the instrumental *rychlostí* 'speed.ins', which is commonly associated with predication.

<sup>15</sup>This property seems to resemble some sort of a monotonicity constraint, as discussed in Schwarzschild (2002). However, the exact nature of this phenomenon requires further investigation.


Sentences such as (21) are far less frequent in the CNC than examples with adjectival DNs in attributive position. However, we regard their existence as an important piece of evidence, supporting the predicative nature of DNs.

### **6 Data summary**

Before we move on to the analysis of the EN/DN distinction, let us briefly recapitulate the empirical findings. Table 1 summarizes the observed contrasts.<sup>16</sup> In brief, ENs are able to target both events and degrees. They have only adverbial forms and tend to appear in eventive environments though they can also modify degree constructions including comparatives and equatives. On the other hand, DNs cannot scope over events and they heavily favor scalar contexts excluding equatives. They can take not only adverbial and nominal but also adjectival morphology, and as such they can quantify over amounts, arguments of events, as well as time intervals associated with social roles specified by the nominals they modify. In the next sections, we attempt to account for at least some of the puzzling differences between the two classes of expressions in question. We will propose an analysis of adverbial DNs and suggest possible directions of development to account for the meaning of ENs as well as adjectival DNs.

<sup>16</sup>The most frequent environments based on the CNC corpus study are in bold.



Table 1: Properties of event and degree numerals

### **7 Proposal**

### **7.1 Degree numerals**

On the basis of the distributional evidence, we argue that the comparative examples introduced in §3.1 reveal the true nature of DNs. Let us now consider more closely the example in (7a), repeated here as (22). The truth conditions of the sentence are specified informally in (22a) and (22b) gives an exemplary situation in which the sentence would be true.

	- a. True in all situations where the height of Petr is equal to the height of Marie multiplied by 2
	- b. *µ*HEIGHT(Petr) = 180 ∧ *µ*HEIGHT(Marie) = 90

Building on the observations discussed in §3.1, we acknowledge that DNs seem to behave similarly to differentials in that they define the difference between compared values on a scale provided by the comparative. Nonetheless, we argue that the underlying mechanism which yields such a result is distinct. DNs differ from typical differentials in that they do not determine the gap in terms of some absolute value, e.g., 10 cm as in (8a). Instead, they provide information about the degree corresponding to a correlate in terms of the value related to a standard of comparison. For instance, in (22) the DN specifies the height of the correlate, i.e., Petr, in terms of the multiplied height of the standard of comparison, i.e., Marie.

We are now ready for the first approximation. Based on the observation discussed in §5.3, namely that *dvojnásobný* can occur in predicate position, see (21), we propose that the primary interpretation DNs have is the predicative one. Furthermore, based on the morphological evidence examined in §4.1, we assume that Czech DNs are compositional. We posit that numeral roots simply refer to numbers modeled as abstract entities and as such are expressions of type *n*. On the other hand, the suffix *-násobn-* introduces an operation involving multiplication of a degree by a number denoted by the root. Therefore, we model DNs as degree predicates, i.e., expressions denoting a characteristic function of degrees (type ⟨*d*,*t*⟩). We posit that such a function yields the truth value True iff a selected degree *d* is two times higher than some contextually determined value *д*. The semantics for *dvojnásobný* is proposed in (23a) whereas (23b) gives the abstracted meaning of DNs in general.


Let us now consider how (23a) accounts for the meaning of (21). The denotation of the subject NP (an expression of type *d*), i.e., the value of saved property, has the property of being equal to the value corresponding to the damages multiplied by two. The logical type of the DN is ⟨*d*,*t*⟩, hence the composition of (21) proceeds via the standard rule of Function Application. The predicate of degrees is applied to the degree-denoting subject (type *d*) and, once the degree variable is saturated, a truth value is obtained.
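As a minimal sketch of the idea that a DN denotes a degree predicate, the snippet below models a DN as a function from degrees to truth values and applies it to the subject degree of (21). The names `degree_numeral` and `d_c` (for the contextually determined value) are hypothetical, and the damages figure of 50 000 CZK is an assumption inferred from the 100 000 CZK mentioned in the text.

```python
from typing import Callable

Degree = int
DegreePredicate = Callable[[Degree], bool]  # stands in for type <d,t>


def degree_numeral(n: int, d_c: Degree) -> DegreePredicate:
    """DN with numeral root n: true of d iff d equals n times the contextual value d_c."""
    return lambda d: d == n * d_c


# Toy values for (21). The saved value (100 000 CZK) is taken from the text;
# the damages value (50 000 CZK) is inferred from it and is an assumption.
damages = 50_000
saved_property = 100_000

dvojnasobny = degree_numeral(2, damages)

# Function Application: the <d,t> predicate is applied to the degree-denoting subject.
print(dvojnasobny(saved_property))  # True
```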

### **7.1.1 Comparatives**

Before we demonstrate how the proposed semantics fits into the big picture involving comparatives and equatives, let us introduce several assumptions concerning gradability and comparison. First of all, we adopt the standard view and assume an ontology including degrees, i.e., objects of a primitive type *d*, which are ordered into scales. A scale is modeled as a triple ⟨*D*, >,*DIM*⟩ where *D* is a set of degrees, > is an ordering relation on *D*, and *DIM* represents a dimension of measurement such as height or weight. Notice, however, that we embrace the interval-based approach to degrees (e.g., Kennedy 2001 and Schwarzschild & Wilkinson 2002).


Second, following Solt (2015) we assume that individuals are associated with scales via measure functions that map an entity to the unique degree on the scale corresponding to the particular dimension. For instance, the measure function *µ*HEIGHT yields the measure of an individual with respect to the dimension of height. Thus, the semantics of a gradable adjective such as *tall* looks like (24).

$$\text{(24)}\quad \left[\text{tall}\right] = \lambda d \lambda x \left[\mu_{\text{HEIGHT}}(x) \ge d\right]$$
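The entry in (24) can be illustrated with a short sketch of a measure function and a curried gradable predicate. The heights are the values used in (22b); the dictionary-based measure function and the names `HEIGHT_CM`, `mu_height`, and `tall` are illustrative assumptions.

```python
# Heights in centimetres, following the scenario in (22b).
HEIGHT_CM = {"Petr": 180, "Marie": 90}


def mu_height(x: str) -> int:
    """Measure function: maps an individual to its unique degree of height."""
    return HEIGHT_CM[x]


def tall(d: int):
    """Curried counterpart of (24): takes a degree d, then an individual x."""
    return lambda x: mu_height(x) >= d


print(tall(170)("Petr"))   # True
print(tall(170)("Marie"))  # False
```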

However, we slightly diverge from the standard semantics for comparatives (e.g., von Stechow 1984, Heim 2000, and Schwarzschild 2008) in that we model the comparative marker in constructions such as (22) as involving the ≥ (rather than >) relation between maximal degrees corresponding to compared entities on a provided scale, as in (25) (a similar treatment of *-er* in English percentage differential comparatives was assumed by Gobeski & Morzycki 2017). What is important is that the ≥ relation may be pragmatically strengthened to = unless a suitable context prevents strengthening. We will discuss this issue in more detail below.

$$\text{(25)}\quad \left[\text{-er}\right] = \lambda D'\lambda D\left[\text{MAX}(D) \geq \text{MAX}(D')\right] \qquad \text{type } \langle\langle d,t\rangle, \langle\langle d,t\rangle, t\rangle\rangle$$

Furthermore, we assume the standard syntactic analysis of comparatives. In particular, we adopt the so-called small DegP view on which the comparative marker *-er* and the than-clause form a constituent at LF and the entire DegP serves as an argument of the gradable predicate (e.g., Heim 2000), as illustrated in (26).

Finally, following Pancheva (2006) we assume that Slavic comparatives such as (22) involve an elided clause introducing the maximal interval corresponding to a standard of comparison on a proper scale. Within such an approach, Czech clausal comparatives like (27a) are analyzed as in (27b).


	- b. LF: [IP [IP Petr is *d*1-tall] [DegP -er<sup>1</sup> [PP than [CP Marie is *d*-tall]]]]

In the assumed structure, the comparative morpheme is interpreted as a quantifier over degrees, i.e., it takes a set of degrees and returns a function from a set of degrees to a truth value (type ⟨⟨*d*,*t*⟩, ⟨⟨*d*,*t*⟩,*t*⟩⟩). As discussed in detail in Pancheva (2006), such typing is incompatible with the denotation of the than-clause since as a free relative it is interpreted as a definite description, i.e., a degree-denoting expression of type *d* (Heim 2000). To remedy such a type clash, some approaches (e.g., von Stechow 1984 and Rullmann 1995) attribute a nontrivial semantics to *than*.<sup>17</sup> We follow this line of analysis. In particular, we adopt Pancheva's (2006) treatment of *than* as a partitive preposition in the domain of degrees which in clausal comparatives gets the semantics in (28).

$$\text{(28)}\quad \left[\text{than}\right] = \lambda d'\lambda d \left[d \text{ is part of } d'\right] \quad \text{(type } \langle d, \langle d, t \rangle \rangle\text{)}$$

In prose, *than* takes the denotation of a free relative clause, i.e., a degree *d*, and yields the set of degrees that are part of *d*. For instance, if the standard of comparison in (27a), i.e., Marie, corresponded to, e.g., 170 cm, then the entire than-clause would denote a set of degrees in the interval between 0 and 170 on the scale of height calibrated in centimeters. In terms of semantic types, the result of *than* being applied to the standard of comparison is an expression of type ⟨*d*,*t*⟩ which can serve as the first argument of the comparative morpheme. We assume that the same mechanism applies to the Czech preposition *než* 'than'.
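The effect of the partitive *than* in (28) can be sketched as follows. Degrees are simplified to whole centimetres on a finite scale, and "is part of" is rendered as lying between 0 and the standard's degree; both choices, as well as the name `SCALE`, are assumptions made only for illustration.

```python
SCALE = range(0, 301)  # assumption: a finite scale of whole centimetres


def than(d_standard: int):
    """Curried counterpart of (28): a degree d is part of d_standard."""
    return lambda d: 0 <= d <= d_standard


# The text's hypothetical: Marie corresponds to 170 cm.
than_clause = than(170)
extension = [d for d in SCALE if than_clause(d)]

print(extension[0], extension[-1])  # 0 170
```

The resulting predicate is of the right kind to serve as the first argument of the comparative morpheme.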

With all the ingredients in place, let us now consider how the pieces fit together. Assuming that Heim & Kratzer's (1998) rule of Predicate Modification applies also to degree predicates, the adopted analysis creates a plausible attachment site for DNs. Since they are expressions of type ⟨*d*,*t*⟩, we propose that they can modify the PP node resulting in a syntactically more complex argument for Deg, as illustrated in the tree in (29). Crucially, the derived expression is also of type ⟨*d*,*t*⟩ which is suitable for the interpretation by the comparative morpheme.

<sup>17</sup>This contrasts with the standard view assuming that *than* is semantically vacuous (e.g., Heim 2000, Kennedy 2001, and Schwarzschild & Wilkinson 2002).


The composition proceeds as follows. The preposition *než* takes the maximal interval to which Marie is tall as its input and yields a set of degrees which are part of that interval. Subsequently, the DN combines with the PP via Predicate Modification, and thus multiplies each member of the set by two. The output is a set of intervals that are two times bigger than the intervals corresponding to Marie's height and can serve as the first argument of the comparative morpheme *-ší*. The comparative morpheme applies the maximization operation MAX which picks the degree, i.e., the maximal interval, to which Marie is tall multiplied by two. As a result, the whole sentence is true iff the degree on a scale of height corresponding to the correlate, i.e., Petr, equals or exceeds twice the value corresponding to Marie, as stated in the truth-conditions in (30a). However, this is not the way one would normally understand a sentence such as (22). In order to account for that deficiency, we propose that (30a) gets strengthened to (30b), i.e., the ≥ relation is replaced by =, which finally yields the desired result. We assume that the pragmatic enrichment results from a scalar implicature, a consequence of the competition between *dvojnásobně* and higher DNs similar to what has been proposed in the neo-Gricean theories of cardinals (e.g., Horn 1972).

(30) ⟦(22)⟧ =

	- a. MAX(*λd*[*µ*HEIGHT(Petr) ≥ *d*]) ≥ MAX(*λd*′[*d*′ = 2 × *µ*HEIGHT(Marie)])
	- b. ↝ MAX(*λd*[*µ*HEIGHT(Petr) ≥ *d*]) = MAX(*λd*′[*d*′ = 2 × *µ*HEIGHT(Marie)])
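The difference between the weak truth conditions in (30a) and the strengthened ones in (30b) can be checked with a small sketch. Degrees are again simplified to whole centimetres on a finite scale; `SCALE`, `MAX`, `truth_30a`, and `truth_30b` are hypothetical names, and the sketch evaluates the two formulas directly rather than modelling the pragmatic competition that derives (30b).

```python
SCALE = range(0, 1001)  # assumption: a finite scale of whole centimetres


def MAX(pred):
    """Largest degree on SCALE that satisfies the degree predicate."""
    return max(d for d in SCALE if pred(d))


def truth_30a(h_correlate, h_standard, n=2):
    """Weak reading: MAX of the correlate set is at least n times the standard."""
    correlate = lambda d: h_correlate >= d     # degrees d such that mu(correlate) >= d
    standard = lambda d: d == n * h_standard   # degrees d' such that d' = n * mu(standard)
    return MAX(correlate) >= MAX(standard)


def truth_30b(h_correlate, h_standard, n=2):
    """Strengthened reading: the two maxima are identical."""
    correlate = lambda d: h_correlate >= d
    standard = lambda d: d == n * h_standard
    return MAX(correlate) == MAX(standard)


print(truth_30a(180, 90), truth_30b(180, 90))  # True True
print(truth_30a(200, 90), truth_30b(200, 90))  # True False
print(truth_30a(170, 90), truth_30b(170, 90))  # False False
```

With the heights from (22b), both readings are true; they come apart only when the correlate exceeds twice the standard.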


On the other hand, in a sentence such as (31a), where *aspoň* 'at least' blocks the pragmatic inference, the unstrengthened meaning surfaces and we obtain the *at least* interpretation given in (30a). The lack of pragmatic enrichment in such examples is parallel to well-studied cases like *more than three boys* where the modified numeral never gives rise to a scalar implicature (see, e.g., Krifka 1999 and Schulz & van Rooij 2006). Another observation concerns the disappearance of scalar implicatures in downward-entailing contexts, as in (31b). Unlike (22), (31b) does not suggest that Petr's height cannot correspond to Marie's height multiplied by three or more. We regard it as an argument in favor of the competition account resulting in the strengthening of (30a) to (30b).

	- b. Petr Petr **není** isn't **dvojnásobně** doubly vyšší taller než than Marie. Marie 'Petr is not two times taller than Marie.'

The developed account seems to deliver desirable results. Not only have we provided an explanation of the semantic composition of DNs within the structure of the DegP, but we have also proposed a plausible analysis of how comparatives modified by DNs are interpreted.

### **7.1.2 Equatives**

So far we have demonstrated how our proposal accounts for the interaction between DNs and comparatives. Let us now turn to one of the main puzzles of the paper, namely the incompatibility of DNs with equatives, as witnessed by the ungrammaticality of (7b) repeated here as (32).

(32) \* Petr Petr je is **dvojnásobně** doubly tak so vysoký tall jako how Marie. Marie

We assume that, similar to comparatives, equative sentences involve a CP with elided material. Unlike comparatives, however, equatives lack an element such as *than* which would shift the type of a free relative of degrees to ⟨*d*,*t*⟩. Therefore, at LF an equative sentence such as (33a) gets the structure in (33b) where the DegP takes the CP as its argument directly (see Gobeski & Morzycki 2017 for a similar analysis of equatives).


	- b. LF: [IP [IP Petr is *d*1-tall] [DegP as… as<sup>1</sup> [CP Marie is *d*-tall]]]

Additional evidence suggesting that the proposed analysis is on the right track comes from the morpho-syntax of Slavic equatives. In Czech, the equative contains only the wh-element *jako* 'how' and the non-obligatory demonstrative pronoun *tak* 'so' (lit. 'like this') which precedes the adjective. Unlike in the comparative, there is no preposition or complementizer.

The final assumption concerns the denotation of the equative marker. We follow the standard view that the meaning of *as… as* differs from the semantics of the comparative morpheme. However, we argue that it is not the case that the only difference between the two lies in employing the = or ≥ relation instead of >, as often assumed (see Rett 2015). On the contrary, we propose that unlike *-er*, which requires a set of degrees as its first argument, see (25), *as… as* yields a function from sets of degrees to truth values for a particular degree (type ⟨*d*, ⟨⟨*d*,*t*⟩,*t*⟩⟩), see (34). In other words, the equative operates on the maximal interval associated with a standard of comparison rather than on a set of degrees. This seems intuitively correct since equative constructions appear to evaluate values with respect to a particular degree rather than to a set of intervals. We assume the same applies to Czech *tak… jako* 'as… as'.

$$\text{(34)}\quad \left[\text{as... as}\right] = \lambda d \lambda D\left[\text{MAX}(D) = d\right] \qquad \text{type } \langle d, \langle \langle d, t \rangle, t \rangle \rangle$$

Given the components discussed above, the reason why DNs are incompatible with equatives is simply a type mismatch. Consider the structure of the DegP illustrated in (35). Since the equative does not involve the node of type ⟨*d*,*t*⟩ but rather the CP of type *d*, the DN cannot combine with any expression within the DegP via Predicate Modification. In principle, Function Application would still be applicable. Nevertheless, if a definite description denoted by the CP saturated the degree variable, the resulting expression could not combine with the equative marker. In any case, the derivation of (7b) would inevitably crash.


At this point, we consider the main puzzle of the paper solved. The (in)compatibility of DNs with comparatives and equatives is essentially type-driven. DNs are of type ⟨*d*,*t*⟩, and thus in comparatives they modify the than-clause of the same type. On the other hand, since there is no such node available in equatives, DNs cannot find a plausible attachment site which leads to type mismatch and unacceptability of sentences such as (7b). In §7.2.2, we will demonstrate that ENs, unlike DNs, can appear in both comparatives and equatives due to the fact that they are of a different semantic type. However, before we move to *dvakrát*, let us briefly discuss adjectival DNs such as *dvojnásobný*.
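The type-driven reasoning summarized above can be replayed with a toy type checker. Types are encoded as nested tuples, and the two composition rules are Function Application and Predicate Modification restricted to ⟨*d*,*t*⟩. The encoding and the names `apply`, `modify`, and `combine` are illustrative assumptions; the types themselves follow (25), (28), and (34).

```python
D, T = "d", "t"
DT = (D, T)              # <d,t>
ER = (DT, (DT, T))       # comparative marker, as in (25)
AS = (D, (DT, T))        # equative marker, as in (34)
THAN = (D, DT)           # partitive than, as in (28)
DN = DT                  # degree numeral
CP = D                   # free relative denoting a degree


def apply(f, a):
    """Function Application: result type if f takes a, else None."""
    if isinstance(f, tuple) and f[0] == a:
        return f[1]
    return None


def modify(a, b):
    """Predicate Modification, restricted here to two <d,t> expressions."""
    return DT if a == DT and b == DT else None


def combine(a, b):
    """Try both orders of Function Application, then Predicate Modification."""
    for result in (apply(a, b), apply(b, a), modify(a, b)):
        if result is not None:
            return result
    return None


# Comparative: than(CP) is <d,t>, the DN modifies it, and -er takes the result.
pp = combine(THAN, CP)
print(combine(ER, combine(DN, pp)))   # (('d', 't'), 't')

# Equative: the DN can only apply to the CP, yielding t, which as... as rejects.
print(combine(DN, CP))                # t
print(combine(AS, combine(DN, CP)))   # None
```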

### **7.1.3 Adjectival degree numerals**

So far, the proposed semantics for DNs seems to work well. However, it is insufficient to account for the data which involve adjectival *dvojnásobný* modifying event and social role nominals, as discussed in §5.2. Inspired by Rett's (2014) M-Op*<sup>e</sup>* and M-Op*<sup>d</sup>* operators, we propose that the analysis of DNs can be extended by adopting operations which introduce mappings between entities, events, degrees, and time intervals.

In general, quantified NPs exhibit an individual/degree polysemy (Rett 2014). This is also true of Czech NPs modified by cardinal numerals. (36a) has an individual reading in which five individuated portions (or sorts) of beer were such that they were top-fermented. On the other hand, (36b) refers to an amount of beer rather than to particular entities (or sorts).

	- b. Pět five piv beers {bylo was pro for Karla Karel dost enough / Karlovi for.Karel stačilo}. was.enough 'For Karel, five beers were enough.'


For DNs, we assume that the degree interpretation is the primary one, as in (37) where adjectival *dvojnásobný* modifies the amount nominal *plat* 'salary' in order to multiply the relevant degree.<sup>18</sup> Apart from the data already introduced in favor of such a claim, further evidence comes from the fact that DNs can target gradable nouns such as *idiot* (see Morzycki 2009), as indicated in (38) which is an example attested in the CNC. The second clausal conjunct asserts that the speaker attributes to themselves the level of idiocy which is twice as high as the contextually relevant value. It is the internal degree argument of the predicate *idiot* that is targeted by the DN.


Similarly, in the case of modified measure nouns such as *dvojnásobný objem* 'double volume', see (16a), we assume that the DN quantifies over the degree though it does not supply the dimension *µ*. The relevant dimension always seems to be provided by the modified predicate. For instance, in a phrase such as *dvojnásobně velký* 'twice as big' (lit. 'doubly big') it is the adjective that feeds the adverbial DN with the dimension of size. Likewise, in NPs such as *dvojnásobná délka* 'double the length' and *dvojnásobný idiot* 'double idiot' the measure noun and the gradable noun supply the dimensions of length and idiocy, respectively. In such examples, the DN simply multiplies values on a proper scale, hence it seems that the proposed degree semantics can be extended straightforwardly to capture such cases. We assume that the core of the analysis of *dvojnásobně* given in (23) would carry over to examples such as (38). In such cases, the DN predicates of a degree supplied by the adjective, measure noun, or gradable noun. However, due to the lack of space we have to postpone a thorough implementation of the general idea. Instead, in the next section we will try to suggest a way of dealing with the data that pose a more serious challenge.

<sup>18</sup>We assume that the composition involves at least the following steps: (i) modification of the amount noun (type ⟨*d*,*t*⟩) by the DN via Predicate Modification and then (ii) type-shifting of the entire phrase to the type *d* via the *ι* operation.


### **7.1.4 Events and social role interpretations**

In order to account for examples such as *dvojnásobná vražda* 'double murder' and *dvojnásobný mistr* 'two-time champion', see (18), we assume mappings between events and entities on the one hand and entities and times on the other. Let us start with proposing a treatment for the social role interpretation. In such cases there is no internal degree argument the DN could target. Therefore, in order to approach, e.g., (18b), we adopt the notion of time trace function (e.g., Krifka 1989 and Lasersohn 1995). A standard time trace function is an operation which maps an event onto its running time, i.e., the smallest time at which it occurs. For our purposes, however, this is insufficient since in order to explain the behavior of phrases such as (18b) we need to relate events with entities. Therefore, we assume a mapping of a property *P*, in this case, the property of being a champion, onto its running time, i.e., the time of being a champion. Consequently, the DN counts the introduced running times, which results in a predicate true of entities that repeatedly gained the property of being a champion.
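The counting of running times can be pictured with the following sketch. The individuals, the title periods, and the names `TITLE_PERIODS`, `running_times`, and `dn_social_role` are invented for illustration; treating the DN as requiring exactly *n* running times (rather than at least *n*) is likewise an assumption.

```python
# Invented data: periods (start year, end year) during which an individual
# held the property of being a champion.
TITLE_PERIODS = {
    "Anna": [(2010, 2011), (2014, 2015)],
    "Boris": [(2012, 2013)],
}


def running_times(periods, x):
    """Maps the property, relative to individual x, onto its running times."""
    return periods.get(x, [])


def dn_social_role(n, periods):
    """DN over a social role: true of x iff x has n running times of the property."""
    return lambda x: len(running_times(periods, x)) == n


two_time_champion = dn_social_role(2, TITLE_PERIODS)
print(two_time_champion("Anna"), two_time_champion("Boris"))  # True False
```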

The proposed approach predicts that the time reading can only be obtained for nominals denoting properties which are constrained in time, i.e., either lower-bounded, as in the case of *champion*, or bilaterally bounded, as in the case of, e.g., *president*. In other words, adjectival DNs are only possible with nominals denoting a property which can be felicitously associated with fluctuation within the dimension of time (Wągiel 2015b). For this reason, the phrases in (39) are odd.

(39) # dvojnásobný two-time Čech Czech.person / člověk human / pes dog

However, the interpretation of modified deverbal nominals such as *dvojnásobná vražda* 'double murder', see (18a), cannot be explained in terms of time trace function. In this case, we assume a mapping between properties of events and entities related to those events as themes, i.e., such a function for a particular event would return its themes. As a result, the two victims reading is obtained.
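The mapping from events to their themes can be sketched in the same spirit. The event records and the names `EVENTS`, `themes`, and `dn_event` are hypothetical, and the exact-count condition is an assumption.

```python
# Invented event records; each event lists the entities related to it as themes.
EVENTS = {
    "e1": {"kind": "murder", "themes": ["victim1", "victim2"]},
    "e2": {"kind": "murder", "themes": ["victim3"]},
}


def themes(event):
    """Maps an event onto the entities that are its themes."""
    return event["themes"]


def dn_event(n, kind):
    """DN over an event nominal: true of events of the given kind with n themes."""
    return lambda event: event["kind"] == kind and len(themes(event)) == n


double_murder = dn_event(2, "murder")
print([name for name, e in EVENTS.items() if double_murder(e)])  # ['e1']
```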

### **7.2 Event numerals**

Our proposal concerning ENs builds on the classification developed by Doetjes (2007) who on the basis of French data draws a distinction between two classes of adverbs of quantification, namely degree expressions such as *a lot* and frequency adverbs such as *often*. According to this view, the division follows from the fact that the former involve degree modification whereas the latter quantify over times.


### **7.2.1 Frequency and scope**

At first sight, ENs seem to be similar to frequency adverbs since they both imply iteration and, unlike degree expressions, can scope over indefinites. The data in (40) illustrate the distinction between frequency and degree adverbs in Czech. Since a similar contrast regards ENs and DNs, as demonstrated in (41), it might seem appealing to simply claim that they are representatives of the corresponding classes.

	- b. \* Petr Petr **hodně** a.lot kupoval bought.ipfv nějaké some pivo. beer
	- b. \* Petr Petr **dvojnásobně** doubly koupil bought.pfv nějaké some pivo. beer

According to von Fintel (1994), frequency adverbs including ENs can be analyzed as expressions which quantify over situations and contain a hidden domain anaphor. Following Doetjes (2007) in assuming an abstract restrictor *times*, it is possible to analyze ENs as in (42). The example in (41a) would then be interpreted as (43) which is true of two buying events in which Petr is the agent and beer is the theme of that event.
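As a rough illustration of an EN quantifying over events, the sketch below counts the events in a toy model that satisfy a restrictor. It does not reproduce the formula in (42); the event records and the names `event_numeral` and `petr_buys_beer` are assumptions made for the example.

```python
# Invented model with three buying events.
EVENTS = [
    {"pred": "buy", "agent": "Petr", "theme": "beer"},
    {"pred": "buy", "agent": "Petr", "theme": "beer"},
    {"pred": "buy", "agent": "Marie", "theme": "beer"},
]


def event_numeral(n):
    """EN with numeral root n: true iff exactly n events satisfy the restrictor."""
    return lambda events, restrictor: sum(1 for e in events if restrictor(e)) == n


def petr_buys_beer(e):
    return e["pred"] == "buy" and e["agent"] == "Petr" and e["theme"] == "beer"


dvakrat = event_numeral(2)
print(dvakrat(EVENTS, petr_buys_beer))  # True
```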


However, as Doetjes (2007) herself observes, there is a scopal asymmetry between expressions such as *often* and ENs, specifically frequency adverbs can have a relational reading whereas ENs cannot. For instance, in (44) the frequency adverb *často* 'often' can be interpreted either as having a wide or a narrow scope relative to *když* 'when'. The relational reading in (44a) could be paraphrased as 'often when he was in Budapest, Karel visited Gellért'. On the other hand, the non-relational reading in (44b) would be interpreted as 'Whenever he was in Budapest, Karel often visited Gellért'. Crucially, (45) has only the interpretation in (45b) and cannot mean something like 'Twice when he was in Budapest, Karel visited Gellért'.

	- b. when > often
	- a. # twice > when
	- b. when > twice

Doetjes (2007) attributes the lack of relational reading to the incompatibility of ENs with the stative interpretation. However, ENs differ significantly from frequency adverbs in yet another respect, i.e., they are compatible with comparatives and equatives and can access internal arguments of degree verbs, as discussed in §3.1 and §3.2. On the other hand, frequency adverbs cannot target scales of degrees, e.g., (46) cannot mean that the height of Petr exceeds/equals the height of Marie multiple times. The only possible reading would be that there are many happenings in which Petr is taller/as tall as Marie, which is a very strange interpretation. Similarly, (47) can only mean that there were multiple events leading to an increase of the demand, i.e., the degree reading is unavailable.


In light of the discussed data, we argue that the assumption that ENs simply quantify over times (which implies iteration) is insufficient to explain all the observed contrasts. Instead, we propose that there is a scale of adverbs of quantification with respect to how wide a scope they can take, see Table 2. In particular, degree adverbs including DNs have the narrowest scope, ENs rank in the middle since they can scope over indefinites, and finally frequency adverbs can have the widest scope resulting in the possibility of relational readings but cannot access internal arguments of degree predicates. Here we see a promising correlation, specifically, the scope of an expression seems to correspond to its sortal polymorphicity. At this point, we can only speculate on what the cause and what the consequence is, and hence we remain agnostic with respect to the nature of the relationship in question. Nevertheless, we intend to investigate this issue in future work.

We propose that the semantics of ENs is more general than that of frequency and degree adverbs. Essentially, we assume that they are basically able to target totally ordered sets of an unspecified type. Since non-stative eventualities comprise time scales which share core properties with degree scales, ENs are, thus, able to modify both events involving duration and degree expressions such as comparatives and equatives. On the other hand, frequency expressions such as *often* can operate only on a specified scale, i.e., a time scale, whereas degree adverbs including DNs target a scale of degrees.

### **7.2.2 Comparatives and equatives**

Finally, let us discuss how ENs differ from DNs in equatives. Consider the examples in (6), repeated here as (48). We propose that in equatives ENs do not measure the gap between the degrees associated with the standard of comparison and the correlate as standard differentials. Instead, they multiply the degree associated with the standard.

(48) Petr je **dvakrát** vyšší než / tak vysoký jako Marie.
Petr is twice taller than / so tall how Marie
'Petr is two times taller than / as tall as Marie.'

Table 2: Scopal properties of adverbs of quantification

We assume that in comparatives and equatives, ENs are simple operators of type ⟨*d*,*d*⟩. They take a degree and return a value multiplied by the number corresponding to the numeral root, see (49a) for the semantics of *dvakrát* and (49b) for the generalized meaning of ENs. As a result, they are less sensitive to a particular structure of a phrase of comparison in which they can appear. We propose that within the DegP ENs pick CPs as their arguments. We hypothesize that their wider scope follows from that fact.


Such a semantics fits nicely both with comparatives and equatives. In (50), the EN adjoins to the CP denoting the maximal interval corresponding to the standard of comparison, i.e., Marie's maximal height, before the partitive preposition applies. The EN returns the maximal degree to which Marie is tall multiplied by two and it is not until then that *než* yields a set of degrees the maximal degree corresponding to Marie is part of. The resulting ⟨*d*,*t*⟩ expression is compatible with the input requirement of the comparative marker *-ší*.


In the case of equatives, see (51), the composition proceeds in a parallel manner, the only difference being that there is no partitive preposition to shift the denotation of the CP to ⟨*d*,*t*⟩. As a result, the equative marker selects the degree provided by the outcome of the multiplication operation introduced by the EN.

Assuming pragmatic enrichment, as discussed in §7.1.1, in both cases we finally obtain the same truth conditions, as specified in (52). This corresponds to our intuition that both sentences are actually equivalent and would be judged true iff the maximal interval to which Petr is tall is equal to the maximal interval to which Marie is tall multiplied by two.

(52) $[\![(48)]\!] = \mathrm{MAX}(\lambda d[\mu_{\mathrm{HEIGHT}}(\mathrm{Petr}) \geq d]) = 2 \times \mu_{\mathrm{HEIGHT}}(\mathrm{Marie})$
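As a rough illustration of how these pieces fit together, the following Python sketch treats an EN as a function from degrees to degrees and checks the truth conditions in (52). The function names and the sample heights are assumptions made purely for illustration, and degrees are simplified to plain numbers; this is not meant as an implementation of the full compositional analysis.

```python
from typing import Callable

Degree = float  # simplification: degrees are modeled as plain numbers


def event_numeral(n: int) -> Callable[[Degree], Degree]:
    """An EN as an operator of type <d,d>: it maps a degree d to n * d."""
    return lambda d: n * d


dvakrat = event_numeral(2)  # 'twice/two times'


def max_degree(measure: Degree) -> Degree:
    """MAX over the set of degrees d with measure >= d.

    The greatest such degree is the measure itself.
    """
    return measure


def truth_conditions_52(height_petr: Degree, height_marie: Degree) -> bool:
    """True iff Petr's maximal height equals Marie's maximal height times two."""
    return max_degree(height_petr) == dvakrat(max_degree(height_marie))


# Hypothetical heights in arbitrary units
print(truth_conditions_52(3.0, 1.5))  # True
print(truth_conditions_52(3.0, 2.0))  # False
```

The same multiplier feeds both the comparative and the equative derivation, which is why a single check suffices once pragmatic enrichment is assumed.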

The proposed analysis seems to derive the desirable truth conditions and explains the different behavior of ENs and DNs in constructions of comparison. Though our approach does not answer the question why ENs can be used both to modify degrees and to count eventualities, we would like to speculate that a possible explanation lies in their type requirement. ENs seem to be polymorphic operators whose domain and range both consist of expressions of a primitive type *d* or *v*, which allows them to target free relatives of degrees as well as event-denoting clauses. However, this hypothesis requires careful consideration and we leave this issue for further investigation.

### **8 Conclusion**

In this paper, we have presented novel evidence from Czech concerning the distinction between two classes of adverbs of quantification, i.e., event numerals such as *dvakrát* 'twice/two times' and degree numerals such as *dvojnásobně* 'doubly/twofold'. We have discussed their distribution and examined multiple contrasts in various environments including equatives and modification of count events. According to our proposal, degree numerals denote properties of degrees, which explains their occurrence in predicate position as well as their ungrammaticality in equatives. On the other hand, event numerals have a more general semantics, which results in wider scope as well as the ability to target both events and degrees. We have hypothesized that event numerals in comparatives and equatives behave as simple operators that yield a multiplied value of an input degree, which allows for compatibility with both comparatives and equatives. Furthermore, we have suggested a treatment for adjectival degree numerals such as *dvojnásobný* 'double/two-time'. Nevertheless, many questions remain open. The exact and systematic representation of the meaning of event and degree numerals poses a challenge for further research. It would also be exciting to pursue a cross-linguistic investigation to explore even more properties of the discussed alternation.

### **Abbreviations**


### **Acknowledgements**

We would like to thank two anonymous reviewers, the audience at the FDSL 12 conference, especially Berit Gehrke, Manfred Krifka, and Barbara Tomaszewicz, as well as Manfred Bierwisch, Daniel Büring, Pavel Caha, Kim Hoangová, Stephanie Solt, Viola Schmitt, and Markéta Ziková for their insightful remarks and inspiring comments. All errors are, of course, our own. We gratefully acknowledge that the research was supported by a Czech Science Foundation (GAČR) grant to the Department of Linguistics and Baltic Languages at the Masaryk University in Brno (GA17-16111S) as well as an Aktion Österreich-Tschechien scholarship awarded to Marcin Wągiel and financed by the Austrian Federal Ministry of Science, Research and Economy (ICM-2016-05748).


### **References**




### **Chapter 5**

## **A thought on the form and the substance of Russian vowel reduction**

### Guillaume Enguehard

Université d'Orléans, CNRS/LLL

This paper is an attempt to formalize Russian vowel reduction within a substance-free approach. My contribution consists in arguing that Russian vowel reduction is a strictly quantitative phenomenon (not a qualitative phenomenon). Finally, I propose a motivation based on the representation of stress in different autosegmental frameworks.

**Keywords:** Russian vowel reduction, phonology, substance-free, Element Theory

### **1 Introduction**

[…] our mission is closer to one of revelation than of perfection. (Hamilton 1980: 132)

Russian vowel reduction is known to be a complex mechanism showing strong variations both in the realization and the neutralization of vowel phonemes. This paper is a modest contribution to the understanding of this phenomenon. My aim is to stress the difference between the substance and the form of Russian vowel reduction. In line with Hjelmslev (1943/1971), I will assume a clear separation between the realization of distinctive units (which I call substance or phonetics) and their abstract relations (which I call form or phonemics). Such a strong dichotomy was also recently renewed in Hale & Reiss (2000) and Dresher (2008) (among others). The aim of this paper is not to compete in the same field as the very valuable studies addressing the realization of Russian unstressed vowels (e.g. Crosswhite 2000a,b; Padgett 2004; among others). These deal with phonetic realizations which are not central to the present paper. I rather propose a parallel, substance-free approach suggesting that the form of Russian vowel reduction is more consistent than its phonetic realization. More specifically, I argue that Russian vowel reduction can be interpreted as a quantitative phenomenon motivated by a length distinction between stressed and unstressed syllables.

Guillaume Enguehard. 2018. A thought on the form and the substance of Russian vowel reduction. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 109–125. Berlin: Language Science Press. DOI:10.5281/zenodo.2545515

In §2, I introduce the various substantial and formal manifestations of Russian vowel reduction. In §3, I propose to analyze Russian vowel reduction as a quantitative – rather than qualitative – phenomenon. Finally, in §4, I suggest that this quantitative phenomenon can be motivated by the representation of stress in some autosegmental frameworks.

### **2 The variation of Russian vowel reduction**

The Russian phonological inventory has five vowel phonemes in stressed syllables (1). Following Garde (1998: §19), I assume that [i] and [ɨ] are allophones of the same distinctive unit /i/: [ɨ] occurs after hard consonants (except velars) and [i] occurs elsewhere (Avanesov 1968: §8; Garde 1998: §95). The definition of vowel phonemes in terms of acoustic or articulatory features is not relevant for the substance-free approach advocated in this paper. For the time being, I simply define e.g. /i/ as a variable with relational properties ¬/u/, ¬/a/, ¬/e/ and ¬/o/.<sup>1</sup>

(1) Russian stressed vowels


The inventory in (1) undergoes a vowel reduction process in unstressed syllables. This process is manifested by (i) a phonetic difference between stressed and unstressed vowels and (ii) a neutralization of some phonological oppositions. Furthermore, both these substantial and formal aspects can vary according to the factors in (2).

	- b. segmental context
	- c. morphological context
	- d. dialectal context

<sup>1</sup> I use the negation symbol ¬ in order to represent oppositions: *x* = ¬*y* should be read as *x* ⊕ *y* or "*x* is right only if *y* is wrong."


### **2.1 Phonological factors**

Russian vowel reduction is conditioned by two phonological factors: (i) the segmental context (after hard consonants vs. soft consonants vs. *š*, *ž* or *ts*) and (ii) the prosodic context (first pretonic syllable vs. non-pretonic syllables).<sup>2</sup> I base the following description on the Standard Russian variety depicted in Avanesov (1968) and Garde (1980/1998).

### **2.1.1 After hard consonants**

Russian vowel reduction after a hard consonant is illustrated in (3). In substantial (i.e. phonetic) terms, we observe a centralization of /a/ and /o/. The resulting vowel is realized as [ɐ] in the first pretonic syllable (3a) and [ə] in other pretonic syllables (3b) or in post-tonic syllables (3c).<sup>3</sup> Vowels /i/ and /u/ never reduce (Avanesov 1968: 38–42).<sup>4</sup>


In formal (i.e. phonemic) terms, the two centralization processes illustrated in (3a) and (3b/3c) result in the same neutralization of /a/ and /o/, represented by the merged box in (4). The place of /e/ (gray box) in this reorganization cannot be determined. Lexically, a stressed /e/ never occurs after a hard consonant (Garde 1998: §103). Even in (rare) loanwords, it never alternates with an unstressed vowel (e.g. *mér, mér-a, mér-u, mér-om, mér-e, mér-y,* etc. 'mayor'). Regardless of the place of /e/, the Russian vowel inventory is reduced to three distinctive units in unstressed context.

<sup>2</sup>Soft consonants are palatal or palatalized consonants. Hard consonants are non-palatal or nonpalatalized consonants. Consonants *š*, *ž* and *ts* belong to a third category.

<sup>3</sup>However, a word-initial non-pretonic /a/ or /o/ is unexpectedly realized as [ɐ] (e.g. [ɐ]tdavát' 'to give back'; see Avanesov 1968: §14), not [ə].

<sup>4</sup>Default grammatical information (such as nominative or singular) is not glossed.


(4) Vowel reduction after hard consonants (Standard Russian)


[u]

a. after a hard consonant (except velars)
b. after a velar and in initial position
c. in first pretonic syllable
d. in other unstressed syllables

### **2.1.2 After soft consonants**

Russian vowel reduction after a soft consonant is illustrated in (5). Substantially, /a/, /o/ and /e/ are fronted and raised to [i] in first pretonic (5a), in other pretonic syllables (5b), and in post-tonic syllables (5c).


Formally, the opposition between /a/, /e/, /o/ and /i/ is neutralized both in pretonic and non-pretonic syllables (6). As a result, the Russian vowel inventory is reduced to two distinctive units in this context.

(6) Reduced vowels after soft consonants (Standard Russian)


### **2.1.3 After** *š, ž,* **and** *ts*

Russian vowel reduction after *š, ž,* and *ts* is represented in (7). Substantially, /a/ is centralized to [ɐ] in pretonic syllables (7a) and [ə] in non-pretonic syllables (7b). As for /o/ and /e/, they are centralized to [ɨ] in pretonic syllables (7c) and [ə] in non-pretonic syllables (7d).


Formally, the mechanisms observed in pretonic and non-pretonic syllables are distinct. In pretonic syllables, a neutralization applies between /e/, /o/ and /i/ (8a). In non-pretonic syllables, a neutralization applies between /a/, /e/ and /o/ (8b). In both cases, the Russian vowel inventory is reduced to three distinctive units.

(8) a. Pretonic: [ɨ] [u] [ɐ]
b. Non-pretonic: [ɨ] [u] [ə]

### **2.2 Morphological factors**

The reduction patterns observed in inflectional suffixes (only after soft consonants and *š*, *ž* or *ts*) differ from the generalizations of §2.1.2 and §2.1.3, both substantially and formally.

Substantially, /a/ and /o/ are centralized to [ə] after soft consonants (9a) and *š*, *ž* or *ts* (9b). As for /e/, it is raised to [i] after a soft consonant (9c), and [ɨ] after *š*, *ž* or *ts* (9d).



These reduction patterns have the same formal representation after a soft consonant and after *š*, *ž* or *ts* (10). A neutralization applies (i) between /i/ and /e/ and (ii) between /a/ and /o/. Again, the Russian vowel inventory is reduced to three distinctive units in these contexts.

(10) Reduced vowels in inflectional suffixes

[u] [ə]

a. after a soft consonant
b. after *š, ž,* or *ts*

### **2.3 Dialectal factors**

The reduction patterns described above concern the Standard Russian variety and are not shared by all dialects. In what follows, I give a brief overview of relevant dialectal features concerning the phonology of unstressed vowels.

Concerning the phonology of unstressed vowels after a hard consonant, Russian dialects can be divided into three groups: dialects with Akanye (Avanesov 1949: §47), dialects with Okanye (Avanesov 1949: §42), and dialects with Ukanye (Avanesov 1949: §43); see (11). I do not discuss subtypes such as varieties with Dissimilative Akanye (Avanesov 1949: §49) and Mixed Okanye-Akanye (Avanesov 1949: §46).

(11) a. Akanye: neutralization of /a/ and /o/
b. Okanye: no neutralization of /a/ and /o/
c. Ukanye: neutralization of /o/ and /u/

Concerning the phonology of unstressed vowels after a soft consonant, Russian dialects can be divided into three other groups: dialects with Yakanye (Avanesov 1949: §60), dialects with Okanye (Avanesov 1949: §56), and dialects with Ikanye (Avanesov 1949: §59); see (12). Subtypes such as varieties with Ekanye (Avanesov 1949: §57) or Dissimilative Yakanye (Avanesov 1949: §64) are not relevant to this paper.

(12) a. Yakanye: neutralization of /a/, /o/, and /e/
b. Okanye: no neutralization of /a/, /e/, and /o/
c. Ikanye: neutralization of /a/, /o/, /e/, and /i/

A schematic geographical distribution of the dialectal features in (11) and (12) is represented in Figure 1 (source: Bukrinskaja et al. 1994).

Figure 1: Distribution of dialectal variants of Russian vowel reduction

Dialects with proper Okanye have no vowel reduction (some subtypes can show some neutralizations in specific segmental contexts; see Avanesov 1949: §46, §56). Akanye and Ikanye refer to the reduction patterns illustrated in (4) and (6) respectively. In what follows, I address the remaining Ukanye and Yakanye patterns.

### **2.3.1 Ukanye**

Substantially, Ukanye is manifested by a raising of /o/ in both first pretonic syllables (13a) and other pretonic syllables (13b). A raising of /o/ can also be found in post-tonic syllables of several central and southern dialects (Avanesov 1949: §104) or in the Kamtchatka dialect (see Gluschenko 2007: 40).



Formally, this raising results in a reorganization of the vowel inventory into three distinctive units, due to the neutralization of the opposition between /o/ and /u/.

(14) Ukanye


### **2.3.2 Yakanye**

Yakanye is substantially manifested by a lowering of /o/ and /e/ after a soft consonant in pretonic syllables (15). Such a lowering of /o/ and /e/ can also be found in other pretonic syllables (see Avanesov 1949: §96) and in post-tonic syllables (Avanesov 1949: §108–112) of several central and southern dialects.<sup>5</sup>


Formally, this lowering also results in a reorganization of the vowel inventory into three distinctive units, due to the neutralization of the opposition between /a/, /o/, and /e/.

(16) Yakanye

<sup>5</sup>The variation of post-tonic vowels is known to be very complex due to, e.g., morphological factors (Avanesov 1949: §107). Thus, it is only mentioned here.


### **2.4 Summary**

To conclude this section, we saw that the Russian vowel inventory is reduced to three distinctive units in unstressed syllables (except after a soft consonant in dialects with Ikanye; see (6)). These three distinctive units are represented with /A/, /I/ and /U/ in (17).<sup>6</sup>

(17) Russian unstressed vowels


Now, if we assume that distinctive units are exclusively defined by a set of abstract relational properties, then the three distinctive units found in unstressed context (17) should not be assimilated to a subset of the five distinctive units found in stressed context (1). Each distinctive unit of the stressed context is defined by a set of oppositions to four other units (e.g. /i/ = ¬/u/, ¬/a/, ¬/e/, ¬/o/). But each distinctive unit found in unstressed context (17) is defined by a set of oppositions to two other units only (e.g. /I/ = ¬/U/, ¬/A/). In that sense, /I/, /A/, and /U/ are less specified than /i/, /e/, /a/, /o/, or /u/. They thus represent archiphonemes. This notion will be discussed and defined below.
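The difference in specification can be made concrete with a small Python sketch that defines each unit solely by its oppositions to the other members of its inventory. The function name and the string encoding of oppositions are illustrative assumptions; only the two inventories come from (1) and (17).

```python
def relational_properties(inventory):
    """Define each unit purely by its oppositions to the other units."""
    return {
        unit: {f"¬/{other}/" for other in inventory if other != unit}
        for unit in inventory
    }


stressed = relational_properties(["i", "e", "a", "o", "u"])  # inventory (1)
unstressed = relational_properties(["I", "A", "U"])          # inventory (17)

print(sorted(stressed["i"]))    # ['¬/a/', '¬/e/', '¬/o/', '¬/u/']
print(sorted(unstressed["I"]))  # ['¬/A/', '¬/U/']
```

Under this encoding, every stressed unit carries four oppositions and every unstressed unit only two, which is the sense in which the latter are less specified.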

I suggest that the main formal aspect of Russian vowel reduction lies in this *underspecification* of vowel phonemes, not in the realizations that result from this underspecification.

### **3 Formal representation of Russian vowel reduction**

In a substance-free approach, it could be tempting to interpret Russian vowel reduction as a simple redistribution of the five vowel phonemes into a reduced ternary inventory. Following such a hypothesis, every stressed vowel could freely alternate with every archiphoneme of the unstressed context. But this is not the case.

Table 1 outlines the various alternations between stressed vowels and their underspecified counterparts in unstressed syllables. It can be observed that these alternations are constrained: e.g., /i/ and /u/ never alternate with the same archiphoneme. In order to formalize this constraint, we need to distinguish the behaviors of vowel phonemes by referring to their respective properties.

<sup>6</sup>These symbols do not correspond to phonetic properties. They could be represented with features |A|, |B|, and |C|, or |X|, |Y|, and |Z|, etc.



Table 1: Alternation between stressed vowels and their underspecified counterparts

I propose to determine the formal properties of vowel phonemes based on the definition of archiphonemes in (18). According to this definition, two phonemes can alternate with the same archiphoneme iff they share a relevant feature. Thus, if /i/ and /u/ never alternate with the same archiphoneme, we can suppose that they do not share any relevant feature. The issue is that /i/ and /u/ seem to share some distinctive properties. Substantially, /i/ and /u/ are [+high]. Formally, they share relational properties such as ¬/a/, ¬/e/, and ¬/o/.

(18) Definition of the archiphoneme (Akamatsu 1988: 201) The archiphoneme is a distinctive unit whose phonological content is identical with the relevant features common to the member phonemes of a neutralizable opposition, which is distinct from any of these member phonemes and which occurs in the position of neutralization.

One possible solution is to assume that the relational properties of phonemes are primitively organized into indivisible sets (e.g. {¬/a/, ¬/u/, ¬/o/, ¬/e/} for /i/ and {¬/a/, ¬/i/, ¬/o/, ¬/e/} for /u/). In this trivial example, /i/ and /u/ do not share any property. Such a representation of distinctive features by means of complex sets is defended in several models like, e.g., Particle Phonology (Schane 1984) or Element Theory (Kaye et al. 1985). Element Theory assumes that distinctive features are organized into complex properties represented by |A|, |I| and |U|.<sup>7</sup> Each vowel can be defined by one or several of these properties.

<sup>7</sup>A substance-free reinterpretation of these features could be {¬|I|, ¬|U|}, {¬|A|, ¬|U|} and {¬|I|, ¬|A|} respectively.


Thus, based on the alternations in Table 1 (sketched in Figure 2) and the definition of archiphonemes in (18), it is now possible to determine the underlying representation of each stressed vowel in terms of abstract features |A|, |I|, and |U|, representing the indivisible properties of the three archiphonemes found in unstressed syllables.<sup>8</sup>

Figure 2: Outline of the neutralizable relations between vowel phonemes

First, we saw that the vowels /i/ and /u/ never alternate with the same archiphoneme. Thus it can be supposed that they do not share any property. For convenience, I represent the distinct properties of /i/ and /u/ with |I| and |U| respectively. Second, /e/ and /i/ can alternate with the same archiphoneme (Types B, C, E). Thus /e/ contains |I|. Third, /o/ and /u/ can alternate with the same archiphoneme (Type F). Thus /o/ contains |U|. Fourth, /e/, /o/ and /a/ can altogether alternate with an archiphoneme opposed to /I/ and /U/ (Types A, D). Thus they all share a property that is neither |I| nor |U|. I represent this property with |A|. The resulting representation of stressed vowels is outlined in (19).
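The four steps above can be sketched in Python as a simple check on shared properties. The feature sets below are read off the prose of this paragraph (the preliminary version only), and the function names are hypothetical; alternations induced by a preceding soft consonant, *š*, *ž* or *ts* are deliberately left out of this sketch.

```python
# Preliminary property sets, as derived step by step in the text
FEATURES = {
    "i": {"I"},
    "u": {"U"},
    "e": {"I", "A"},
    "o": {"U", "A"},
    "a": {"A"},
}


def shared_features(*phonemes):
    """Properties common to all the given phonemes."""
    return set.intersection(*(FEATURES[p] for p in phonemes))


def can_neutralize(*phonemes):
    """Following (18): phonemes can alternate with the same archiphoneme
    only if they share at least one relevant property."""
    return bool(shared_features(*phonemes))


print(can_neutralize("i", "u"))        # False
print(can_neutralize("e", "i"))        # True
print(can_neutralize("o", "u"))        # True
print(shared_features("e", "o", "a"))  # {'A'}
```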

(19) Representation of Russian vowels (preliminary version)


One can observe that /o/ and /a/ can also alternate with /i/ after a soft consonant, *š*, *ž* and *ts* (Types B, C). Accordingly, we should assume that they both contain an |I|. However, this leads to an important issue: in this case, both /e/ and /a/ would be defined by |IA|. Fortunately, it can be argued that the |I| found in this context is inherited from the preceding consonants. From the substantial point of view, these are palatal or palatalized segments that trigger a fronting and a raising of /o/ and /a/ via assimilation. From the formal point of view, it is more difficult to argue that soft consonants share an |I| feature with the archiphoneme /I/. Nevertheless, I already mentioned that /e/ is lexically found after soft consonants only (see §2.1.1). In other words, /e/ contains a feature that neutralizes the opposition between hard and soft consonants. This feature can be |I| or |A|; see (19). The vowel /a/ can be preceded by either a soft or a hard consonant. However, there is a correlation between /i/ and the hard/soft contrast: e.g., the verbal suffix *-i* always triggers a softening of the preceding consonant (e.g., *bro*[s]*-át'* 'throw.ipfv' vs. *bro*[sʲ]*-ít'* 'throw.pfv'). Thus, it can be supposed that the opposition between these two consonant classes is due to the presence of a property shared by /e/ and /i/, namely |I|.

<sup>8</sup>Types B and C are not taken into account in this outline. They will be discussed below.

The representation in (19) raises another issue concerning the definition of archiphonemes. Following the definition (18), the result of a neutralization process is a *new* phonological item defined by the set of relations common to the member phonemes of a neutralized opposition. In this respect, |A|, which is the representation of the archiphoneme /A/, cannot be the representation of the fully specified vowel /a/. Indeed, the phoneme /a/ has something more than the archiphoneme /A/: it contrasts with /e/ and /o/. This property should be represented by an additional feature in /a/ distinguishing it from /A/. The only possible feature, distinct from |I| in /e/, |U| in /o/ and *zero* in /A/, is the |A| feature. Thus, if we want to represent the formal distinction between /a/ and /A/, we should assume that /a/ is a complex vowel made of two |A| features. Such a repetition of distinctive properties in the structure of vowels was already proposed in the Particle Theory of Schane (1984), developed in Carvalho (1993; 1994). Extending the same reasoning to the representation of /i/ and /u/, I now assume the representation of stressed and unstressed vowels in (20a) and (20b), respectively.

	- a. Stressed vowels

b. Unstressed vowels



Following this representation, Russian vowel reduction can be interpreted as a quantitative phenomenon. Stressed vowels have more distinctive properties than unstressed vowels. I propose to represent this distinction with the rule in (21a). This rule purposefully does not refer to the quality of distinctive properties. Indeed, such an ambiguity is likely to derive variations. According to this mechanism, /e/ can be reduced to |A| *or* |I|, and /o/ can be reduced to |A| *or* |U|. This parametrical choice depends on the phonological, morphological and dialectal contexts. In order to account for the regularity of this choice in a given language variety, I propose the principle in (21b). The term "configuration" refers to (i) the representation of a given segment and (ii) both its phonological and morphological contexts. As an example, a pretonic /a/ and a non-pretonic /a/ represent two distinct configurations. They may or may not have two different interpretations.

(21) a. Unstressed vowels lose a distinctive property.
b. For a given speaker, a configuration A always has an interpretation B.
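The interplay of the two principles can be sketched in Python as follows. The two-property representations of stressed vowels are reconstructed from the prose above (/a/ as two |A|, and likewise for /i/ and /u/) and should be read as an assumption of this sketch; the `CHOICE` table, its keys, and the function names are hypothetical and cover only the treatment of /o/ after hard consonants in Akanye and Ukanye varieties.

```python
# Stressed vowels as pairs of properties (assumed reconstruction)
STRESSED = {
    "i": ("I", "I"),
    "e": ("I", "A"),
    "a": ("A", "A"),
    "o": ("U", "A"),
    "u": ("U", "U"),
}


def reduction_candidates(vowel):
    """(21a): an unstressed vowel loses one distinctive property,
    so either of its two properties may be the one that survives."""
    return set(STRESSED[vowel])


# (21b): within one variety, a configuration has a fixed interpretation.
# Hypothetical entries: a configuration is simplified to (variety, vowel).
CHOICE = {
    ("akanye", "o"): "A",
    ("ukanye", "o"): "U",
}


def reduce_vowel(variety, vowel):
    """Return the single property that survives in unstressed position."""
    candidates = reduction_candidates(vowel)
    if len(candidates) == 1:
        # /i/, /a/, /u/: only one possible outcome
        return next(iter(candidates))
    survivor = CHOICE[(variety, vowel)]
    if survivor not in candidates:
        raise ValueError("the chosen property is not part of the vowel")
    return survivor


print(reduction_candidates("e") == {"I", "A"})  # True
print(reduce_vowel("akanye", "o"))              # A
print(reduce_vowel("ukanye", "o"))              # U
```

The rule itself only removes one property; which property survives is looked up per configuration, which keeps the quantitative statement separate from the variety-specific choice.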

The principle in (21a) concerns the form of distinctive units, and the principle in (21b) concerns the substance of distinctive units. Following Hjelmslev (1936) (here cited from Hjelmslev 1973), these principles refer to different components of the language: "phonematics" and "phonology":

One and the same phonematic system may be pronounced by means of very different phonological systems. (Hjelmslev 1973: 159)

The contribution of this analysis consisted in suggesting that Russian vowel reduction can be analyzed as a strict quantitative phenomenon if we do not refer to substance. In the following section, I suggest that this quantitative phenomenon can be motivated by the representation of stress.

### **4 Motivation of Russian vowel reduction**

In the previous section, we saw that the formal representation of Russian vowel reduction is strictly a matter of complexity (i.e., quantity of information). But one can ask: Why should an unstressed vowel have fewer distinctive properties than a stressed vowel?

Interestingly, Russian vowel reduction is related to another quantitative phenomenon conditioned by stress: Russian stressed vowels are phonetically longer than unstressed vowels (Zlatoustova 1953; Vysotskij 1973; Al'muhamedova & Kul'sharipova 1980: 47; Svetozarova 1982: 155–158; Crosswhite 2000a: 5–7; Crosswhite 2000b: 116–117; Knjazev 2006: 43). Such a correlation between stress and vowel length can be observed in several languages, and it was represented with an extra time unit provided by stress in Chierchia (1986), Larsen (1998), Ségéral & Scheer (2008), Crosswhite (2000a,b), Bucci (2013), and Enguehard (2016), among others. In what follows, I represent this extra unit with an x-slot on the right of the stressed nucleus (22).<sup>9</sup>

```
(22) d[ˈɔ]m 'house'
x x [x] x
   stress
d o m
```
The relation between vowel length and vowel reduction was already formalized in Crosswhite (2000a,b). Crosswhite proposed that the sonority of vowels is conditioned by the presence of a mora in stressed and pretonic syllables. Here, I propose to take the additional step of unifying length and Russian vowel reduction with the following generalization: the amount of realized vowel features is proportional to the amount of available skeletal slots. If a vowel stands in a stressed position, it has two available slots and all its distinctive properties are manifested (23a). If a vowel stands in an unstressed position, it has one available slot and only one of its distinctive properties can be manifested (23b). It turns out that Russian mid vowels are abstractly represented as sorts of diphthongs.<sup>10</sup>


Note that the distinctive property that is manifested in (23b) is not necessarily |A|. In a dialect with Ukanye, the realized property is |U| (e.g., *d*[ɐ]*mój* vs. *d*[u]*mój* 'home').
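The generalization that the number of realized properties follows the number of available skeletal slots can be sketched in Python. The function names, the encoding of /o/ as the pair ("U", "A"), and the `preferred` parameter are assumptions of this sketch, not part of the proposal itself.

```python
def available_slots(stressed: bool) -> int:
    """A stressed nucleus has its own slot plus the extra slot provided
    by stress; an unstressed nucleus has a single slot."""
    return 2 if stressed else 1


def realize(properties, stressed, preferred=None):
    """Realize as many distinctive properties as there are slots.

    `properties` is a pair such as ("U", "A") for /o/. `preferred` names
    the property that survives when only one slot is available; which one
    survives is a variety-specific choice.
    """
    slots = available_slots(stressed)
    if slots >= len(properties):
        return tuple(properties)
    candidates = set(properties)
    if len(candidates) == 1:
        return (properties[0],)
    if preferred not in candidates:
        raise ValueError("a variety-specific choice is needed here")
    return (preferred,)


o = ("U", "A")
print(realize(o, stressed=True))                   # ('U', 'A')
print(realize(o, stressed=False, preferred="A"))   # ('A',)  cf. d[ɐ]mój
print(realize(o, stressed=False, preferred="U"))   # ('U',)  cf. d[u]mój
```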

<sup>9</sup>One could object that length belongs to the substance while stress belongs to the form. Alternatively, Enguehard (2016) proposed that the relation between them could be inverted: stress is a possible substantial realization of length.

<sup>10</sup>Note that this would be an issue if Russian had *real* diphthongs. But it does not have any.


### **5 Conclusion**

In conclusion, this paper is an attempt to formalize Russian vowel reduction without referring to substance. I suggested that Russian vowel reduction does not handle the quality of vowel phonemes, but the quantity of their distinctive properties. Then, I proposed that the quantity of distinctive properties is conditioned by the quantity of skeletal slots. In that sense, the Russian qualitative distinction between stressed and unstressed syllables is not very different from length distinctions observed in languages like Italian (see Parmenter & Carman 1932). Such a generalization supposes an interesting convergence (for further studies) between paradigmatic and syntagmatic axes.

### **Abbreviations**


### **References**




### **Chapter 6**

## **The Russian perfective present in performative utterances**

Anja Gattnar University of Tübingen

Johanna Heininger University of Tübingen

### Robin Hörnig

University of Tübingen

This paper aims to show that perfective verbs in Russian can, contrary to the common view, be used in performative utterances without losing the performative meaning of the sentences. In Russian, performative utterances are generally built with an imperfective (ipv) verb in present tense, first person singular or plural. According to the Slavistic literature, the perfective (pv) verb is at most used in marked contexts and with a few selected performative verbs. In our contribution, we will show experimentally that the use of present perfective verbs in performative utterances is considerably more widespread than supposed so far. In two experiments, Russian native speakers located events in time, providing evidence, first, for the temporal interpretation of the sentence depending on the verbal aspect, and second, concerning whether the temporal interpretation differs depending on how much context is given.

**Keywords:** Russian, verbal aspect, speech act, performative verbs, interpretation, experimental evidence

Anja Gattnar, Johanna Heininger & Robin Hörnig. 2018. The Russian perfective present in performative utterances. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 127–146. Berlin: Language Science Press. DOI:10.5281/zenodo.2545517


### **1 Introduction**

### **1.1 General remarks**

Aspect use in performative utterances in Russian is the core issue of the present paper. We adopt the terminology of Eckardt (2012) and define a performative utterance as a sentence that is used to issue a speech act by applying a speech act verb. Since the present tense of the verb is a precondition for a performative utterance, the ipv verbal aspect is preferred in Russian. However, the Slavistic research literature describes cases where a performative speech act is expressed by a pv verb. This is interesting, because the pv aspect is thought to be unable to appear in present tense. Example (1a) shows a sentence expressing an ordinary, well-formed performative speech act, whereas the corresponding version (1b) with pv *predložu* is unacceptable. Example (2) demonstrates the same mismatch with another speech act verb:

(1) b. \* Predložu otpravit'sja domoj.
propose.pv go home
Intended: 'I propose to go home.'

(2) a. kljanjus' ot čistogo serdca.
swear.ipv from pure heart
'I swear with all my heart.'

(2) b. \* pokljanus' ot čistogo serdca.
swear.pv from pure heart
Intended: 'I swear with all my heart.'

Different from the verbs in (1) and (2), there are other speech act verbs allowing pv aspect, as in (3):

(3) b. Ja poprošu vas govorit' gromko i po očeredi.
I ask.pv you speak loud and by order
'I ask you to speak loudly and one by one.'


Dickey (2000), for example, has noticed that for some speech act verbs in performative utterances both ipv and pv aspect can be used. However, his study is limited to some particular verbs like the pv *verba dicendi* *skazat'* 'to tell', *priznat'sja* 'to confess', *zametit'* 'to note', *pribavit'* 'to add', *poprosit'* 'to ask for', *povtorit'* 'to repeat', *doložit'* 'to report' (Dickey 2000: 179). In his opinion, some pv *verba dicendi* are not allowed, like *predložit'* 'to propose' and *pokljast'sja* 'to swear'; see (1) and (2).

We want to show that pv speech act verbs can be used in performative utterances to a larger extent than previously expected. We do not assume that the pv and ipv performative utterances are used interchangeably. In our opinion, a pv speech act verb has an influence on the pragmatic interpretation of the speech act. We will not investigate interpretation differences in depth in this paper; rather, we want to establish experimentally that both aspects can indeed be used to utter a performative speech act.

In the following, we give a short overview of the Russian aspectual system (§1.2). Afterwards we explain the peculiarities of performative speech acts and how aspect use is related to them (§1.3). Then, the phenomenon of the present perfective is described, which has been intensively studied in the Slavistic literature (§1.4). Subsequently, we discuss the present perfective in performative speech acts and present the relevant literature on Russian performatives (§1.5). These theoretical issues are followed by the presentation of two experiments that we conducted in St. Petersburg in 2016 (§2). Finally, we discuss our results and give an outlook for future research (§3).

### **1.2 The Russian aspectual system and tense**

In Russian, aspect is a grammaticalized category. Nearly every Russian verb has two aspects that are morphologically distinguished and differ in grammatical function: the imperfective aspect (ipv) and the perfective aspect (pv). These verb pairs are derived by prefixes or suffixes: *pisat'* 'to write.ipv' and *napisat'* 'to write.pv'; *otkryt'* 'to open.pv' and *otkryvat'* 'to open.ipv'.<sup>1</sup> The ipv aspect is used for (i) habitual or iterated actions, (ii) single, incomplete actions in progress, and (iii) actions which do not emphasize the result. The pv aspect is used (i) for single, completed actions or (ii) ongoing actions intended to be completed.<sup>2</sup>

<sup>1</sup>Other verb pairs are opposed by suffix only: *kričat' – kriknut'* 'to cry' or by suppletion *brat' – vzjat'* 'to take'. A smaller group of verbs do not form pairs: (i) biaspectual verbs: *kaznit'* 'to punish', (ii) *imperfectiva tantum*: *sidet'* 'to sit', and (iii) *perfectiva tantum*: *rinut'sja* 'to pounce on'.

Morphosyntactically there exist only three tense categories: preterite, present, and future. Not all three categories are represented in both ipv and pv aspect. Whereas ipv verbs conceptualize all three tense categories, pv verbs appear only in preterite and future, because present tense is not compatible with the concept of completeness. The lack of present tense marking for pv verbs plays a key role in our investigation. Table 1, a simplified version of Swan (1978), summarizes the (semantic) categories resulting from crossing aspect with tense in Russian.

Table 1: Tense and aspect in Russian


### **1.3 Performatives**

A speech act is called performative when the utterance and the action named by a speech act verb take place simultaneously. The utterance is part of the action (Austin 1962) and performs it. Performative utterances are not statements that are true or false, but concrete, unique actions. In Russian, by default, performatives are expressed with the ipv aspect present, first person; see examples (4)–(6).


<sup>2</sup>There is a huge range of works on verbal aspect and its meaning to which we cannot refer in this paper. Therefore we limited our selection to pure Slavistic or Russian works that are generally accepted among Slavists and in Russian aspectology: Anstatt (2003); Avilova (1976); Bondarko (1971); Breu (1980; 2000); Comrie (1976); Dickey (2000); Galton (1976); Klein (1995); Lehmann (1999); Maslov (1984); Mehlig (1981); Padučeva (1996); Petruchina (2000); Rassudova (1982); Zaliznjak & Šmelev (2000); etc.


(6) Ja očen' žaleju, čto my ne vstretilis' s vami.
I very apologize.ipv that we not meet with you
'I deeply apologize that we didn't meet you.'

We share the opinion with Apresjan (1988), Padučeva (1994), and Petruchina (2000) that performative verbs in Russian can only express a punctual event and not a process. They do not describe an ongoing event, because the action expressed by the verb is accomplished once the speaker finishes the utterance (Petruchina 2000). Therefore, it would be incorrect to translate one of the above examples, for instance (4), with the English continuous form: *\*I am promising you to go to grandmother.*<sup>3</sup>

As pv cannot get a present tense marking in Russian, we would expect that only the ipv speech act verbs can be used in performative speech acts. However, we have found examples with pv speech act verbs in performative speech acts, as in (7a) from the Russian National Corpus (RNC):<sup>4</sup>

(7) b. Pozdravljaem že našich peredovikov i zaodno prezidenta s neverojatnym uspechom!
congratulate.ipv.1pl ptcl our labor.activists and simultaneously president with amazing success
'We congratulate our labor activists and also the president for the amazing success.'

In (7a) the pv speech act verb *pozdravim* 'congratulate.pv.1pl' is used to perform a speech act. In (7b) we replaced the pv verb of the original sentence with the corresponding ipv verb *pozdravljaem* 'congratulate.ipv.1pl'. (7b) is a properly built performative sentence with the ipv verb meeting all three conditions for a successful performative speech act: speech act verb, first person, and present tense. We find it plausible to assume that (7a) expresses a performative speech act, too.

We are interested in whether a pv speech act verb changes the sentence meaning compared to the corresponding ipv verb, for instance with respect to our variants (7a) versus (7b). The occurrence of present perfective speech act verbs is documented in many works, but we do not know of any experimental investigation addressing the interpretation of performative utterances as a function of the verb aspect. Are utterances with pv speech act verbs actually understood as performative speech acts? If yes, what does this imply for the temporal localization of the event denoted by the pv speech act verb? In our study we presuppose that the localization of an event denoted by a speech act verb in the present indicates a performative interpretation. We feel confident that sentences with pv speech act verbs are performative utterances only if they express an event that proceeds simultaneously with the utterance time. This is only possible when the pv speech act verb is interpreted as present perfective.

<sup>3</sup>Harnish (2007) discusses the English present progressive in performatives and shows that performative utterances favor the simple present.

<sup>4</sup>Interestingly, pv speech act verbs systematically fail the 'hereby'-test, which is only feasible with ipv verbs: *S ėtim ja prošu*[ipv] *vas govorit' gromko.* 'Hereby I ask you to speak loudly' vs. \**S ėtim ja poprošu*[pv] *vas govorit' gromko* (Eckardt 2012).

In the next section we will present arguments for a pv in performative utterances in Russian and invoke the debate on the present perfective.

### **1.4 The present perfective in Russian**

The debate on the present perfective started with Koschmieder (1929). He declares, initially only for Polish, that the present perfective is possible in non-future meaning solely when the action time coincides with the utterance time and when the verb is in first person form. Forsyth (1970: 120) even claims: "Their use in non-future meanings, however, is extremely common and not in the least exceptional." Švedova (1980) supports this view and notes that under certain syntactic conditions the pv verb can denote actions that take place in the present and not in the future, with nuances of meaning. Rathmayr (1976) goes even further. She is of the opinion that the present perfective is equal to ipv present plus some stylistic function; yet the stylistic properties are difficult to identify: even if they are identified by a survey of native speakers, they are expected to diverge strongly.

Dickey (2000), as before him Bondarko (1971) and Galton (1976), calls the phenomenon of present perfective "the temporal coincidence of a situation that is referred to a pv present form in the moment of the utterances". The present perfective does not refer to the future but to the time of utterance and, simultaneously, to the time at which the action denoted by the pv verb takes place. De Wit (2017) dubs the phenomenon differently, "the present perfective paradox", because the meaning of the temporal localization that belongs to the pv aspect should prevent the use of present perfective in Russian. Additionally, the occurrence of present perfective in Russian is explained in terms of the aspectual function of the pv aspect. De Wit (2017) agrees with Breu (2000) who notices that the aspectual meaning of the present perfective is stronger than the temporal meaning. In present perfectives, the aspectual meaning should be stronger than the temporal meaning of the aspect, because the meaning of temporal localization that is expressed by the pv aspect would prevent the use of present perfective. We will discuss this view at the end of the paper.

So far we have argued for the availability of a present perfective in Russian. It remains open, however, what kind of influence the perfective present has in contexts where it substitutes for the ipv.

### **1.5 The range of present perfective in performatives**

Like others, we accept the present perfective as a means of expression with the above-mentioned readings. We argue that the high acceptability of the present perfective implies that pv speech act verbs are able to fulfill a performative speech act. This is a purely theoretical assumption based on the theoretical works mentioned above, empirically supported only in a few cases by way of corpus data (Łaziński 2014; Wiemer 2014). Before we present our experimental work, it is necessary to mention some points concerning the type of speech act verbs that are used in ipv and pv, and to outline possible conceptual differences between the use of ipv and pv speech act verbs that are offered in the research literature.

In the Slavistic research literature several works attest the occurrence of pv speech act verbs in performative speech acts. But the use of pv verbs, according to these works, is limited to special verb types. For example, Rjabceva (1992) and Dickey (2000) claim that only a few pv *verba dicendi* can be used in competition with ipv performatives. Only for those verbs may the pv verb be used, and only those pv verbs may perform a performative utterance. Contrary to Rjabceva and Dickey, Wiemer (2014) shows that the use of pv speech act verbs is also possible for some social performatives like request, desire, thanks, refusal and approval; see example (8). Łaziński (2014) agrees with Wiemer and demonstrates similar corpus data for Polish, Czech and Slovak.

(8) Ispol'zuja ėti sposoby, uverju čto vam budet legko zanimat'sja russkim jazykom.
using these methods assure.pv that you.dat will.be easily study Russian language
'When you use these methods, I assure that you will easily learn Russian.' (Wiemer 2014: 107)

The corpus findings of Wiemer and Łaziński lead us to the question of whether the range of pv verbs in performative utterances is wider than Rjabceva and Dickey assume. Some more detailed consideration is given by Israeli (1996; 2001). She classifies speech act verbs into three groups depending on the verbal aspect that a speech act verb can take to perform a speech act (Israeli 2001: 84): (i) verbs performing a speech act only with ipv (see (1) and (2)), for example *prikazyvat'* 'to order.ipv', *trebovat'* 'to demand.ipv', *blagodarit'* 'to thank.ipv', *pozdravljat'* 'to congratulate.ipv' etc.; often the ipv speech act verb has an iterative meaning; (ii) verbs performing a speech act both with ipv and pv (see (3)), for example *prosit'/poprosit'* 'to request.ipv/.pv', *sovetovat'/posovetovat'* 'to advise.ipv/.pv', *želat'/poželat'* 'to wish.ipv/.pv', etc.; (iii) verbs performing a speech act only with pv; in the latter case, the verb functions as a structuring element, like *perejdëm k novoj teme* 'let's open.pv a new future topic', *otmetim* 'we note.pv', *zametim* 'we mention.pv' etc. (see example (10)).


According to Israeli (2001), ipv and pv performative utterances of the second group cannot be used interchangeably. This makes the aspectual competition particularly interesting for us. Although the alleged semantic or pragmatic differences in the interpretation of ipv and pv speech act verbs are not the central issue of this paper, we would like to briefly address Israeli's account. Whereas we believe that her account provides a promising perspective for future investigation, the first task accomplished here is to provide evidence that pv verbs actually can be used in carrying out a performative speech act.

Israeli argues that ipv and pv speech act verbs differ with respect to authority marking in performative utterances. A typical example of authority marking in performative utterances in her sense is seen in (11a) from the oral corpus of the RNC. According to Israeli, the sentence shows, in comparison with (11b), how the different aspect use can influence the speaker's position of authority:



The use of the pv verb *poprosit'* 'to ask.pv for' in (11a) can be connected with the communicative situation. The pv speech act verb stresses the authority of the teacher [+authority] towards the student. In (11b) the ipv verb *prosit'* 'to ask.ipv for' is used in a communication between a young man and a museum attendant. We might argue with Israeli that the ipv verb in (11b) is pragmatically neutral or even a polite request.<sup>5</sup> In §2 we will now present our two experiments that give evidence that sentences with a pv speech act verb are interpreted as present tense utterances.

### **2 Experimental evidence**

We have provided instances of present perfective in performative speech acts in Russian from the literature as well as from the RNC. The interpretation of the pv speech acts has not yet been demonstrated experimentally. Rathmayr (1976) mentions that she has asked four (sic!) informants and every one of them has given her another interpretation. Others work with their own intuition or support their arguments by presenting examples from corpus investigation (Wiemer 2014; Łaziński 2014). The main concern is to study if sentences like (11a) are interpreted as performative speech acts or not. In (11a) the verb has the grammatical form 1sg pv aspect. The additional meaning that refers to the aspect function of pv aspect would be 'will ask for'. In future meaning the sentence is not a performative speech act but a statement about an event in the future: In the future there will be a situation in which I am saying *I ask you to speak loudly and one by one.* Our aim is now to investigate the temporal alignment of pv performative verbs in morphological present.

Our assumption is that in performative contexts the use of the present perfective is more widespread than is reflected in the literature so far. We even tend to assume that every pv speech act verb can in principle be used to execute a performative speech act. Our experiments reported below compare the temporal interpretation of speech act verbs with perfective versus imperfective aspect: Are pv speech act verbs never, or reliably less often, interpreted as present tense than ipv speech act verbs? We assume that:

<sup>5</sup>The examples (11a) and (11b) do not only differ in aspect use. In addition, the [+authority] marked utterance (11a) has an overt subject *ja* 'I' whereas in (11b) there is a null subject. We also agree with one of the reviewers that the sentences improve with overt subject. Our own corpus investigation leads us to the assumption that an overt subject encourages the [+authority] marker. We did not yet test sentences with overt subjects experimentally, but consider it a future task to do so.



The two experiments that are presented in this section test our hypothesis that pv speech act verbs used in performative utterances may substitute ipv verbs.

### **2.1 Method**

### **2.1.1 Participants**

41 native speakers of Russian participated in Experiment 1 without Stop-Reading (as explained in §2.1.3 below); a different sample of 40 Russian native speakers took part in Experiment 2 with Stop-Reading. All participants were students of Saint Petersburg State University. They were paid 10 € for their participation.

### **2.1.2 Material**

20 verbs were selected from a pool of 28 speech act verbs, based on acceptability scores gathered in a web-based pilot study:<sup>6</sup> *uverit' / uverjat'* 'to assure sth. to so.', *izvinit'sja / izvinjat'sja* 'to apologize for sth.', *poprosit' / prosit'* 'to ask for sth.', *potrebovat' / trebovat'* 'to demand sth. from so.', *poželat' / želat'* 'to wish sth. to so.', *poblagodarit' / blagodarit'* 'to thank so. for sth.', *priznat'sja / priznavat'sja* 'to admit sth. to so.', *priglasit' / priglašat'* 'to invite so. to sth.', *razrešit' / razrešat'* 'to allow so. to do sth.', *objazat'sja / objazyvat'sja* 'to commit oneself to sth.', *pochvalit' / chvalit'* 'to praise so. for sth.', *predupredit' / predupreždat'* 'to warn so. of sth.', *predstavit' / predstavljat'* 'to introduce so. to so.', *poprivetstvovat' / privetstvovat'* 'to welcome so.', *priznat' / priznavat'* 'to recognize so. as so.', *prikazat' / prikazyvat'* 'to order so. to do sth.', *otklonit' / otklonjat'* 'to reject sth.', *pozdravit' / pozdravljat'* 'to congratulate so. for sth.', *prostit' / proščat'* 'to forgive sth. to so.', *otkazat' / otkazyvat'* 'to refuse sth. to so.'

<sup>6</sup>43 Russian native speakers judged performatives containing the verbs without preceding context on a scale from 0 to 6 (= most acceptable); mean acceptabilities of the 20 selected verbs were 3.7 (SD 0.96) and 1.7 (SD 0.96) for performatives with ipv and pv verb aspect, respectively.

Two variants of a performative target sentence (in short: performative) were constructed for each verb. The variants differed only in the aspect of the sentence-initial verb, which was either imperfective or perfective present in the first person singular, as exemplified in (12a) and (12b). Both performative variants were preceded by the same context consisting of two or three sentences.<sup>7</sup>

(12) a. Uverjaju Vas, čto ėta dolžnost' – važnyj šag na puti k uspechu.
assure.ipv.pres.1sg you that this position – great step on way to success

(12) b. Uverju Vas, čto ėta dolžnost' – važnyj šag na puti k uspechu.
assure.pv.pres.1sg you that this position – great step on way to success

'I assure you that this position is a great step towards success.'

In addition to the performatives, two variants of non-performative, declarative target sentences (in short: declaratives) were constructed for each of the twenty verbs, serving as control items. Again, the target variants differed only in the aspect of the verb which was either ipv or pv past in the third person singular, as exemplified in (13a) and (13b). Both declarative variants were preceded by the same context which differed from the one of the performatives.

(13) a. Vrač uverjal ich v tom, čto oni vse skoro vyzdorovejut.
doctor assure.ipv.past.3sg them at that that they all soon will.recover

(13) b. Vrač uveril ich v tom, čto oni vse skoro vyzdorovejut.
doctor assure.pv.past.3sg them at that that they all soon will.recover

'The doctor assured them that they will all recover soon.'

<sup>7</sup>The complete list of stimuli can be found here: http://hdl.handle.net/11022/0000-0007-CB0A-A@Appendix.pdf.

In addition to the performatives and the controls, 40 fillers were added to the material. The two variants of the performatives and the controls were assigned to two lists such that each item variant appeared on exactly one list and each list contained 10 performatives and 10 controls in each aspect (ipv and pv). About the same number of participants was tested with either list; hence all participants worked on a set of 80 items, each consisting of a context followed by a target.
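The list assignment can be illustrated with a minimal Python sketch. It is not the authors' actual procedure: the alternating rotation rule and all names (`build_lists`, `cell_counts`, the verb numbering) are assumptions made for illustration. The text only states that each item variant appears on one list and that both aspects are balanced within each list.

```python
from collections import Counter

VERBS = range(1, 21)             # 20 verb pairs (numbering is hypothetical)
TYPES = ("performative", "control")
ASPECTS = ("ipv", "pv")


def build_lists():
    """Distribute the 80 target variants over two lists of 40.

    Assumed rule: for odd-numbered verbs list A gets the pv performative
    and the ipv control, for even-numbered verbs the reverse; list B
    always receives the complementary variants.
    """
    list_a, list_b = [], []
    for verb in VERBS:
        for t_index, item_type in enumerate(TYPES):
            a_aspect = ASPECTS[(verb + t_index) % 2]
            b_aspect = ASPECTS[(verb + t_index + 1) % 2]
            list_a.append((verb, item_type, a_aspect))
            list_b.append((verb, item_type, b_aspect))
    return list_a, list_b


def cell_counts(items):
    """Count the items in each (type, aspect) cell of one list."""
    return Counter((item_type, aspect) for _, item_type, aspect in items)
```

Under this assumed rule each list holds 40 targets, 10 per combination of sentence type and aspect; together with the 40 fillers this yields the 80 items per participant.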

### **2.1.3 Procedure**

Participants were tested separately in a quiet room at the Laboratory of Cognitive Studies at the State University of Saint Petersburg. Participants were randomly assigned to Experiment 1 without Stop-Reading or Experiment 2 with Stop-Reading. Participants were seated in front of a PC and instructed about the task to be performed. In each experiment participants worked on three practice trials to get familiar with the procedure before they moved on to the experimental block of trials.

In Experiment 1 without Stop-Reading, a trial began with a full presentation of the context. Participants read the context until they understood what happened and then pressed the space bar. Now the context was replaced by the target sentence displayed left-aligned in the centre of the screen. Participants read the target sentence to understand what happened next; their task was to indicate by means of three cursor keys where the event described in the target sentence was located in time: '←' ≙ past, '↑' ≙ present, '→' ≙ future (Response 1). For the sake of congruence with Experiment 2, the whole sentence was presented again immediately after Response 1, prompting participants to indicate the location again by pressing one of the cursor keys (Response 2). In order to encourage participants to read the contexts and targets carefully, half of the trials ended with a yes-no comprehension question that was answered by means of two designated keys (mean accuracy: 91%). A session lasted for about 20 minutes.

Trials in Experiment 2 with Stop-Reading began with a full presentation of the context, too. Once participants understood what was told in the context they pressed the space bar. Now the context was replaced by the target sentence displayed left-aligned in the centre of the screen, yet masked except for the first word; masked characters other than blanks were substituted by underscores. Participants could then read the target sentence from left to right in a word-by-word fashion (moving window technique): with the first press of the space bar the first word was masked and the second word was uncovered; with each subsequent press the current word disappeared and the following word showed up. In this way participants could proceed until the end of the sentence. However, beginning with the presentation of the first word of the target sentence, participants could stop reading at any time by pressing one of the cursor keys instead of the space bar if they felt able to indicate where the described event is located in time: '←' ≙ past, '↑' ≙ present, '→' ≙ future (Response 1). Immediately after Response 1, the sentence was presented as a whole, prompting participants to indicate the location again via a cursor key (Response 2). Half of the trials ended with prompting an answer to a yes-no comprehension question (mean accuracy: 91%). A session lasted for about 30 minutes.
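The masking scheme of the moving window display can be sketched in a few lines of Python. This is an illustrative reconstruction from the description above, not the software used in the experiment; the function names are hypothetical, and words are assumed to be separated by single blanks.

```python
def mask_sentence(sentence, visible_index):
    """Return the display string with exactly one word uncovered.

    Every character of the other words is replaced by an underscore;
    blanks are kept, so word boundaries remain visible.
    """
    words = sentence.split(" ")
    shown = [
        word if i == visible_index else "_" * len(word)
        for i, word in enumerate(words)
    ]
    return " ".join(shown)


def moving_window(sentence):
    """Yield the successive displays, one per press of the space bar."""
    for i in range(len(sentence.split(" "))):
        yield mask_sentence(sentence, i)
```

For example, `mask_sentence("Uverju Vas", 0)` gives `"Uverju ___"`. Stop-Reading then amounts to the participant pressing a cursor key at some display instead of requesting the next one.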

### **2.1.4 Main objectives**

It was of main interest where events described by performatives are located in time. Events described by performatives are expected to be located in the present if they are interpreted as a performative speech act; non-performative interpretations should lead to localizations in the future. Performatives with ipv verb aspect should therefore generally lead to localizations in the present. Performatives with pv verb aspect are expected to also lead to a substantial amount of localizations in the present. The greater the loss of performative power due to the pv aspect, the more reduced should be the frequency of localizations in the present. If the localization in time depends to a large extent on the verb aspect, i.e., on verb morphology, the localization should be quite insensitive to the remaining content of the target sentence. In particular, localizations should be unaffected by the possibility to stop reading.

### **2.2 Results**

The data were subjected to a generalized linear mixed model (GLMM) with a binomial link function, using the *lmer* function of the *lme4* package (Bates et al. 2015) for the R software for statistical computing (R Core Team 2014). When preparing the data for analysis, we realized that the performative target sentences for five of the 20 verbs deviated crucially from the stipulated structure in that the speech act verb was placed later than sentence-initially (see items 8, 13, 14, 18 and 20 in the stimuli; see link in footnote 7). One additional item, 6, had to be dropped due to a wrong stress marking. The analysis is thus based on 14 performatives, with 6 and 8 items instantiating the same condition on the two lists. *α*-errors for *z*-values are marked as follows: \*\*\* if *p* < .001; \*\* if *p* < .01; \* if *p* < .05.

### **2.2.1 Response 2 in Experiments 1 and 2**

Localizations in the present or future are valid if occurring after performatives (98% and 96% valid in Experiments 1 and 2); localizations in the past are valid if occurring after declaratives (88% and 89% valid in Experiments 1 and 2). The proportions of valid localizations in the present are 81% versus 70% for ipv and pv aspect in Experiment 1 and 78% versus 52% in Experiment 2. The GLMM converged for random intercepts for participants and random intercepts and slopes for items. In addition to the two main effects of Aspect and Experiment, the interaction was also significant [Asp: *z* = 4.50\*\*\*; Exp: *z* = 2.54\*; Asp×Exp: *z* = 2.69\*\*]. Localizations in the present decreased from ipv to pv aspect more strongly with than without Stop-Reading (Exp. 2: 77 to 51%; Exp. 1: 81 to 70%), as shown in Figure 1.
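The structure of the model can be written out as follows. This is a sketch under assumptions that are not stated in the text: a logit link (the usual choice for binomial responses), the symbol names, and a by-item random slope for Aspect (the text does not say which factor the item slopes refer to in this analysis). Here $p$ indexes participants and $i$ items, $A_{pi}$ codes the aspect of the verb, and $E_p$ codes the experiment, which varies between participants.

```latex
\[
\operatorname{logit} P(y_{pi} = \text{present})
  = \beta_0 + \beta_A A_{pi} + \beta_E E_{p} + \beta_{AE} A_{pi} E_{p}
  + u_{p} + v_{i} + w_{i} A_{pi}
\]
% u_p      : random intercept for participant p
% v_i, w_i : random intercept and (assumed) Aspect slope for item i
```

On this reading, the reported *z*-values for Aspect, Experiment and their interaction correspond to the coefficients $\beta_A$, $\beta_E$ and $\beta_{AE}$.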

Figure 1: Response 2 (present versus future) in Experiments 1 and 2 as a function of verb aspect


### **2.2.2 Early versus late responses in Experiment 2**

Figure 2 shows how valid localizations in time accumulate across the regions of target sentences with ipv (left panel) and pv verb aspect (right panel). Numbers indicate the proportions of localizations in the present within the valid responses, i.e., disregarding continuations. Whereas we recognize no trend for the ipv aspect, it appears that for the pv aspect these proportions remain around 41% until they rise in the last region up to 51% for Response 2. To determine whether the increase is substantial, Response 2 was categorized as Early (if it matched Response 1 given earlier than region 8) or Late (if it matched Response 1 given on region 8 or revised earlier Response 1) and was subjected to a GLMM analysis with the fixed factors Aspect and Time (Early vs. Late). The GLMM converged for random intercepts (participants and items) and random slopes for Aspect (items). In addition to a strong effect of Aspect, Aspect interacted with Time [Asp: *z* = 3.61\*\*\*; Asp × Time: *z* = 3.40\*\*\*]. We take this interaction to show that the proportion of localizations in the present is indeed substantially larger for late compared to early responses in case of a pv aspect (71 vs. 41%; total *n*: 91 vs. 177); no such difference is obtained in case of an ipv aspect (74 vs. 78%; total *n*: 89 vs. 181).
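The Early/Late categorization and the relation between the reported proportions can be made explicit with a short Python sketch. The function names and the data layout are hypothetical, and treating region 8 as the final region is an assumption based on the description above.

```python
LAST_REGION = 8  # assumed to be the final region of the target sentence


def classify_response2(r1, r1_region, r2):
    """Return 'Early' or 'Late' for one valid trial of Experiment 2.

    r1, r2    -- localizations ('present' or 'future')
    r1_region -- region (1..8) at which Response 1 was given
    """
    if r1_region < LAST_REGION and r2 == r1:
        return "Early"   # decided before the end and not revised
    return "Late"        # decided on region 8, or revised afterwards


def pooled(p_late, n_late, p_early, n_early):
    """Overall proportion obtained by weighting both groups by their n."""
    return (p_late * n_late + p_early * n_early) / (n_late + n_early)
```

Weighting the reported proportions by the reported totals gives `pooled(0.71, 91, 0.41, 177)` ≈ 0.51 for pv and `pooled(0.74, 89, 0.78, 181)` ≈ 0.77 for ipv aspect, which matches the 51% and 77% given for Response 2 in Experiment 2.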

Figure 2: Responses 1 and 2 in Experiment 2 as a function of aspect dependent on sentence position

In sum, the results substantiate the claim that the pv aspect on a speech act verb reduces its performative force compared to the ipv aspect, i.e., it reduces the probability that a native speaker interprets the sentence containing it as performing a speech act. However, the performative force of the verb is often preserved nevertheless, in that speakers frequently interpret the utterance of the sentence as a speech act. In addition, given a pv verb aspect, there is evidence that speakers are more likely to opt for a speech act interpretation after having processed the uttered sentence as a whole. This claim is supported by many more speech act interpretations in Experiment 1 without Stop-Reading than in Experiment 2 with Stop-Reading; further evidence comes from Experiment 2, in which speech act interpretations were more frequent if participants read the whole sentence than if they stopped reading before the end of the sentence. This might be taken to indicate that the aspect morphology of the sentence-initial pv verb is in conflict with a speech act interpretation, with the latter prevailing in particular if based on a full interpretation of the sentence. In line with this, we observe that a valid Response 1 often persists in Response 2, in particular for localizations in the present, 95%, compared to localizations in the future, 82%.

### **3 Discussion and outlook**

The results of the experiments confirm our hypothesis that pv speech act verbs can be used in performative utterances, and may in principle substitute for the ipv speech act verbs. Our investigation does not explain the restrictions on the class of pv verbs that can occur in performative utterances. Like Wiemer (2014), we tend to the opinion that pv performatives are lexicalized to a certain extent. For our investigation, the evidence that pv speech act verbs are interpreted as present tense verbs is the most important result. In both experiments taken together, about 60% of the pv speech act verbs were interpreted as present perfective. As not all of our speech act verbs were *verba dicendi* (for example 'to thank', 'to invite', 'to welcome', etc.), we may conclude that not only *verba dicendi* but also other speech act verbs can be used in performative speech acts. In line with our hypothesis, we have strong evidence that present perfective speech act verbs have performative force. This is shown by the frequent present tense localizations of events denoted by our pv speech act verbs. Localization based on the full sentences promoted the localization of the pv performatives in the present tense. We infer this from the comparison of the two experiments. Moreover, we found a late increase of localizations in the present tense in Experiment 2 and a persistence of early localizations in the present. Therefore, we conclude that the sentence context plays an important role for the temporal localizations in the case of pv speech act verbs. The verbal aspect is thus not the decisive factor for the well-formedness of performative utterances in Russian. The interpretation as performative is also influenced by the particular semantics of the speech act verbs, the sentence embedding the speech act verb, and maybe the preceding context.

Summing up, we reach the following conclusions, which are in part preliminary and need further support:

First, pv speech act verbs can be used in performative speech acts because, due to the available present tense interpretation, they fulfill the 'present tense' condition that is indispensable for carrying out a performative speech act. Second, the information conveyed by the sentence material following the pv speech act verb has an influence on the interpretation of the verb if it bears pv but not ipv aspect. Third, given the very low ratings of pv performatives without preceding context (see footnote 5), we suspect that the speech act interpretation also benefits from the preceding context. Evidence for this comes from the fact that the late present localizations of pv verbs in Experiment 2 rise nearly to the rates for the ipv verbs. In the case of pv speech act verbs, we can even speak of an interaction between the successive enhancement of context information and the localization in the present. The pv speech act verb by itself may be crucial for present localization, but a more reliable localization is reached when the speech act verb is embedded in a wider context. The stronger performative power of the pv aspect in Experiment 1, where the whole speech act appeared before the decision, confirms that the amount of sentence information influences the decision.

As far as we know, this is the first experimental investigation of aspect use in Russian performatives showing that pv speech act verbs can be used in performative utterances. A next step would be to answer the question of whether and how the use of pv speech act verbs influences the sentence meaning in comparison to ipv speech act verbs. Like Israeli, we tend to hypothesize a pragmatic difference between ipv and pv performative utterances; see example (12). It would be interesting to check whether an overt subject even strengthens the marking of authority in performative utterances. Another important consideration is the verbal semantics of ipv and pv speech act verbs. If we argue with Breu (1980) and De Wit (2017), we must also look at the verb-immanent aspectual functions in which ipv and pv speech act verbs differ from each other. Following this line of reasoning, ipv speech act verbs would name and perform the performative event, whereas a pv speech act verb would emphasize the completion of the performative speech act. Both approaches are well worth pursuing and will motivate further experimental investigation.


### **Abbreviations**


### **Acknowledgements**

Special thanks go to Prof. Tatiana V. Chernigovskaya, head of the Laboratory for Cognitive Studies, St. Petersburg State University, and her staff member Daria Chernova. They supported us in recruiting participants and placed a laboratory at our disposal for conducting the experiments. Furthermore, we wish to thank Tilman Berger, Stefan Heck, Eugen Kravchenko, Inna Pirina, Tatiana Perevozchikova and two anonymous reviewers for their helpful comments.

### **References**




### **Chapter 7**

## **The nature(s) of syntactic variation: Evidence from the Serbian/Croatian dialect continuum**

Peđa Kovačević
University of Novi Sad

Tanja Milićev
University of Novi Sad

The paper reports on a study of the variation inside the Serbo-Croatian dialect continuum with respect to clitic placement, complements of modal verbs (infinitive/da+present) and the use of *trebati* 'need' as either an experiencer verb or a simple transitive. Region and ethnicity accounted for a large portion of variation in the use of infinitives and da+present with many speakers using these structures interchangeably. Next, we found that clitics are almost uniformly placed after the first phrase. The variation in the use of lexical *trebati* was confined to the Croatian portion of the sample. Our findings suggest that (i) infinitives and da+present after modal verbs should be treated as roughly the same syntactic structure; (ii) variation in clitic placement should not be analyzed as an instance of sociolinguistic variation and deeper (linguistic) causes of variation should be pursued; (iii) *trebati* as a transitive verb appears in the Croatian variety only.

**Keywords:** syntactic variation, Serbo-Croatian dialect continuum, clitic placement, non-finite complements

### **1 Introduction**

### **1.1 Syntactic variation: Theoretical framework**

Recent theoretical approaches to syntactic variation have enabled us to form a more fluid picture of syntax (Adger 2006; Adger & Trousdale 2007; Adger & Smith 2005). Updates on the rather rigid classical Principles and Parameters theory (Chomsky & Lasnik 1995) like Kayne's (2000) microparametric approach or Kroch's (1994) Competing Grammars have struggled with the fact that syntactic variation can be quite free and apparently even optional in some cases. More recently, in line with more general theoretical advances, it has been argued that syntactic variation, with all its apparent fluidity, can be captured within the Minimalist Framework (Adger 2006; Adger & Trousdale 2007; Adger & Smith 2005).

Peđa Kovačević & Tanja Milićev. 2018. The nature(s) of syntactic variation: Evidence from the Serbian/Croatian dialect continuum. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 147–167. Berlin: Language Science Press. DOI:10.5281/zenodo.2545519

This approach provides us with a way of looking at variation which predicts much greater freedom on the part of speakers to move between two different structures depending on the context. Furthermore, the approach is freed of the assumption that some speakers constantly move from one grammar to another as they produce different constructions.

Instead of relying on parameters and/or microparameters as explanatory mechanisms, Adger and colleagues assume that syntax is simply a set of uniform core operations applied to lexical items. What appears as syntactic variation, thus, arises when (i) there are two or more ways of pronouncing the same structure or (ii) when there are different uninterpretable morphosyntactic features on competing lexical items. Adger (2006) illustrates this with an example of different T heads that can be found in Standard English and dialects like Buckie English and others, which give rise to different spellouts of the auxiliary *be*. While in Standard English, T is sensitive to agreement and spells out the agreement patterns morphophonologically, in non-standard dialects, *be* is either completely insensitive to agreement (i.e. bears no uninterpretable phi-features) or simply does not spell out reflexes of agreement in the same way as in Standard English. Either way, a speaker can have both lexical items (T heads) in their mental lexicon and depending on which one they choose, the output will vary. The way the speaker employs these different lexical items is determined by sociolinguistic factors in the sense of Labov (1972).

The Serbo-Croatian dialect continuum provides a very useful testing ground for theories of syntactic variation. Our primary goal in this paper is to present some data from an empirical study of three instances of syntactic variation in this dialect continuum in order to arrive at a clearer factual description of the phenomena at hand. We will also provide sketches of formal analyses of these three structures, which will show that the approach developed by Adger and his co-workers is a very useful theoretical tool when it comes to explaining the observed data.


### **1.2 Syntactic variation in the Serbian/Croatian dialect continuum**

It is a well-known fact that the differences between Serbian and Croatian standard varieties belong mostly to the lexicon and the domain of pronunciation (Corbett & Browne 2009; Bailyn 2010, *inter alia*). The most prominent differences in the realm of pronunciation have to do with the way in which speakers of these two varieties pronounce words that used to contain the so-called yat sound in the older varieties of the language. In modern Serbian, this sound is pronounced as /e/, while in modern Croatian it is either /je/ or /ije/, depending on the length of the earlier vowel. Based on these different pronunciations, the two standard varieties are also called Ekavian and Ijekavian. In terms of differences in vocabulary items, one can mention that due to historical factors the Croatian variety tended to borrow more from German, Czech and other languages of Central Europe, while Serbian contains more borrowings from Turkish and other languages of the Balkans (Corbett & Browne 2009).

Syntactic variation in this dialect continuum seems limited to just a few potential cases. One of the best known points of difference has to do with the structure of non-finite verbal complements. Example (1a) illustrates the option of infinitives functioning as complements of modal verbs while in (1b), the modal is followed by the so-called da+present structure. Standard grammars of Croatian draw a sharp distinction between the Serbian da+present option and the Croatian infinitives (Katičić 1986). However, Bailyn (2010) provides some empirical evidence to the effect that both varieties allow both options and infinitives are simply more common in Croatian.

(1) b. Ivan mora **da** **pojede** večeru.
Ivan must da eat.pres.3sg dinner
'Ivan must eat his dinner.'

Another area of potential syntactic variation would be the positioning of clitics. When it comes to clitics, standard Croatian grammars prescribe placing the clitics after the first word, a rule that is sometimes referred to as the 2W rule (Katičić 1986; for criticism see Peti-Stantić 2009). This rule is illustrated in (2b). In Serbian, the most neutral rule is to place the clitics after the first phrase, a rule known as the 2P rule (2a). Corbett & Browne (2009) suggest that the 2W rule is less common in the context of the clitic-second phenomenon because under 2W, the clitic cluster splits a constituent.


(2) b. Pravi **je** igrač došao.
true aux.cl player come
'A true player has come.'

Regarding the verb *trebati* 'need', we find that Standard Croatian grammars recognize its existence as a transitive verb (3a), while in Serbian, it appears only as an experiencer verb (3b). Also, as a modal verb, in Serbian *trebati* is prescribed as being always impersonal (4b), as opposed to Croatian (4a).

(3) b. Ivanu **treba** knjiga.
Ivan.dat need.imp book.nom
'Ivan needs a book.'

(4) b. Deca **treba** / \***trebaju** da odu.
children.nom.sg need.imp / need.3pl da go.3pl
'Children need to go.'

Standard grammars of both Croatian and Serbian often focus on essentially eliminating the variation and prescribing one option as "more natural" for a given variety. Therefore, in a sense, they present an overly rigid either–or, binary picture of variation in these domains.

In order to make sense of the variation in these domains, one needs to have a clear picture of the underlying facts, which we claim are not correctly represented in descriptive grammars. Therefore, the primary aim of this research is to provide some empirical insight into the nature of variation in these three aspects. Next, we will argue that the data point towards the view of variation proposed by Adger & Trousdale (2007) and Adger & Smith (2005). Finally, we will suggest ways of analyzing these constructions formally based on the implications that arise from this particular view of variation.


### **2 Empirical data: Production study**

The enumerated instances of potential syntactic variation in the Serbo-Croatian dialect continuum were confirmed by a simple production study.<sup>1</sup> The production task consisted of a written survey that elicited structures like (1)–(3). The nonfinite complements of modals were elicited by means of sentences like (5) where the target verb appears in the first part of a compound sentence in its finite form. In the second part of the sentence, the same verb is supposed to appear in its non-finite form after a modal, but a blank is given in its stead. The participants were instructed to fill in the blank with the form of the underlined verb that they found most suitable. They were also instructed not to leave out the verb because those sentences are grammatical even when the verb is elided. There were 20 sentences in total, and the targeted non-finite structures were placed in the contexts of modals *moći* 'can', *morati* 'must', the phasal verb *početi* 'start', and the verb *želeti* 'want'. Five target sentences were dedicated to each of these contexts.

(5) Milan je pojeo salatu, a Ivan još mora \_\_\_\_\_\_\_\_ desert.
Milan is ate salad while Ivan still must \_\_\_\_\_\_\_\_ dessert
'Milan ate the salad while Ivan still has to (eat) the dessert.'

When it comes to the variations in clitic placement, the task was to shift sentences like (6a) into past tense (6b). As can be seen in the examples in (6), the sentence in the present tense does not contain clitics, but in the past tense, the auxiliary clitic *je* 'is' is necessary. However, the position of the clitic can be varied as indicated in the example. It can either come immediately after the demonstrative *ta* 'that', or it can come after the subject noun phrase *ta gospođa* 'that lady', in accordance with 2W or 2P rules respectively. There were 12 target sentences in total and the sentences were organized into four groups according to the type of the prenominal modifier (demonstrative, descriptive adjective, possessive or demonstrative adjective). Each modifier appeared in all three genders (masculine, feminine, neuter).

(6) b. Ta {**je**} gospođa {**je**} pravila kolače.
that aux.cl lady aux.cl made cookies
'That lady made cookies.'
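The two placement options for the clitic can be made concrete with a minimal sketch. It assumes that a clause is represented as a list of phrases, each a list of words; the function name `place_clitic` and the bracketing of the example clause into phrases are our own illustrative assumptions, not part of the study materials.

```python
def place_clitic(phrases, clitic, rule):
    """Insert `clitic` into a clause given as a list of phrases (lists of words).

    rule="2W": after the first word of the clause (may split the first phrase).
    rule="2P": after the first whole phrase.
    """
    if not phrases or not phrases[0]:
        raise ValueError("clause must start with a non-empty phrase")
    words = [w for phrase in phrases for w in phrase]
    if rule == "2W":
        position = 1
    elif rule == "2P":
        position = len(phrases[0])
    else:
        raise ValueError(f"unknown rule: {rule!r}")
    return " ".join(words[:position] + [clitic] + words[position:])


# Clause from (6b) without the clitic; the phrase bracketing is an assumption.
clause = [["Ta", "gospođa"], ["pravila"], ["kolače"]]
second_word = place_clitic(clause, "je", "2W")
second_phrase = place_clitic(clause, "je", "2P")
print(second_word)    # Ta je gospođa pravila kolače
print(second_phrase)  # Ta gospođa je pravila kolače
```

The two rules yield the same string whenever the first phrase consists of a single word, which is why the target sentences all begin with a modified noun phrase.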

<sup>1</sup>A detailed description of the design, including all the experimental items, can be found at https://osf.io/m5feh.


Finally, the verb *trebati* 'need' was elicited by means of asking a question where the most likely response will be a sentence containing this verb. However, the crucial thing to worry about was avoiding the use of this verb in the question itself because the way the verb is used in the question would have a great impact on how it would be used in the answer. Because this was a written task, it was possible to provide one or two sentences in the way of context and then ask the question like (7a) using a verb other than *trebati*, but also state that the verb *trebati* should be used in the answer.

(7) b. Petar treba olovku.
Peter.nom need.3sg pencil.acc
'Peter needs a pencil.'

c. Petru treba olovka.
Peter.dat need.imp pencil.nom
'Peter needs a pencil.'

The possible answers to the question in (7a) were either (7b) or (7c). The choice of one over the other would reveal the way the participant uses the verb *trebati* in his or her everyday speech. Because we studied the variation in the use of this one verb only, we had only three target sentences that we wanted to elicit.<sup>2</sup>

When it comes to the choice of participants, we were interested in the way in which geographic location and ethnicity influenced the use of these constructions. Our sample consisted of 120 participants from Serbia and Croatia, ages 16–19. They were divided into four groups with 30 participants each. One group consisted of 30 students attending the so-called gymnasium school (*gimnazija*) in Zagreb. One group was located in the town of Ruma, roughly 60 kilometers west of Belgrade. This group also consisted of 30 gymnasium students. Finally, there were two groups in the town of Subotica, in the north of Serbia, on the border with Hungary. The reason we had two groups in this town was that in Subotica, there was the option of varying the ethnicity of the participants while controlling for their geographic location. Namely, the gymnasium in this town has a Croatian track alongside the Serbian track. What this means is that students have the option to enroll in classes that are taught in standard Croatian, and a number of ethnic Croats choose this option. In order to vary the ethnicity of the students while controlling for geographic location, we created one group of 30 students from the Croatian track and one group of 30 students from the Serbian track.

<sup>2</sup>An anonymous reviewer points out that the presence of a dative argument in the elicitation question could have primed the subjects to also use a dative in the response with the verb *trebati*. This might have reduced the number of transitive uses. The fact that the Zagreb group still largely opted for the transitive *trebati* (as opposed to groups from Serbia) shows that the possible priming effect was not nearly strong enough to suppress the transitive use.

It is important to note that our sample was constructed in such a way as to compare the dialect spoken in Zagreb with the dialects spoken in Vojvodina, the Northern Province of Serbia. Zagreb was taken as a benchmark representing a dialect close to the Croatian standard (the participants were students of the Classical Gymnasium in Zagreb, a very prestigious school with a strong focus on languages). Towns in Vojvodina, on the other hand, were of interest to us because they represent the kind of gray area between the Croatian and the Serbian standard where one can zoom in on the speakers who speak neither of the standards but are quite close to both of them at the same time.

Finally, it should be pointed out that this was a pilot study into the vast realm of syntactic variation. We believe, though, that it gives a good starting point towards the understanding of patterns in variation when it comes to the syntactic structures we focused on.

### **3 Findings**

The empirical data that we obtained pointed to quite different patterns of variation in the three structures under investigation. Concerning the variation in non-finite complements, we compared our groups based on the number of infinitives that each participant produced. In Figure 1, mean values for the number of infinitives are given for each group. The results from the groups from Subotica are given in the middle of the graph, with SerbianClass standing for the group made up of students attending the Serbian track and CroatianClass for the students attending the Croatian track.

Figure 1: The average number of infinitives across groups

As the graph in Figure 1 suggests, there are important differences in the use of infinitives as non-finite complements across these groups. These differences are statistically significant (LR *p* < 0.01, *r*<sup>2</sup> = 0.63). Despite the fact that the group in Zagreb used infinitives almost exclusively, we can conclude that these structures can vary quite freely in the production of a significant number of speakers. The histogram in Figure 2, which shows how the use of infinitives was distributed across the entire sample, provides a deeper insight into the nature of the variation in this area.

Figure 2: Frequency distribution for the number of infinitives in the entire sample

In Figure 2, the x-axis represents the number of infinitives used in the target sentences, while the height of each bar shows how many participants produced that number of infinitives. As the graph shows, a large portion of the participants either used infinitives throughout, or they systematically avoided them. The height of the bar above zero on the x-axis shows the portion of participants who did not use infinitives at all, while the height of the bar above 20 on the same axis indicates the share of the subjects who used infinitives only. However, there is also a sizable portion of the sample where these structures are in quite free variation. In other words, for many participants there were no clear preferences for either infinitive or da+present. A closer look at the surveys done by some of these participants reveals no discernible pattern or context-dependent preference for one of the structures.
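The tabulation behind a histogram of this kind can be sketched as follows. This is a minimal illustration only: the counts in `toy_counts` are invented, the function name `summarize` is hypothetical, and we assume that every one of the 20 blanks was filled with either an infinitive or da+present, so that zero infinitives means da+present throughout.

```python
from collections import Counter

N_ITEMS = 20  # target sentences with a non-finite complement slot


def summarize(infinitive_counts, n_items=N_ITEMS):
    """Tabulate how many participants produced each number of infinitives
    and split them into categorical and variable users."""
    histogram = Counter(infinitive_counts)
    never = histogram[0]          # bar above 0: no infinitives at all
    always = histogram[n_items]   # bar above 20: infinitives only
    variable = len(infinitive_counts) - never - always
    groups = {
        "da+present only": never,
        "infinitive only": always,
        "variable": variable,
    }
    return histogram, groups


# HYPOTHETICAL per-participant counts, invented for illustration only.
toy_counts = [0, 0, 3, 9, 11, 14, 20, 20, 20]
histogram, groups = summarize(toy_counts)
print(sorted(histogram.items()))
print(groups)
```

The "variable" category corresponds to the participants between the two extreme bars, for whom the two structures alternate.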

When it comes to the variation in clitic placement, we obtained very different results. In our survey, there were 12 target sentences eliciting one or the other clitic placement option. In Figure 3, we plotted the mean numbers of sentences in which the 2P rule was observed. The means are given for each of the four groups.

Figure 3: The average number of applications of 2P rule across groups

Simply by inspecting the graph visually, one can notice that the pattern of variation was different from what was observed with non-finite complements. The mean values for each of the groups are close to the maximum of 12, and linear regression found no statistically significant difference among the groups (*p* = 0.205, *r*<sup>2</sup> = 0.0136). A more detailed look at the surveys reveals that a very small number of participants did produce several instances of the 2W rule, but these were rather marginal, as the preponderance of participants in all four groups used the 2P rule only.
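For readers less familiar with the statistic, the *r*<sup>2</sup> of a regression with one numeric predictor can be computed as in the sketch below. The chapter does not state how group membership was coded, so the 0–3 coding, the variable names and the toy scores are all assumptions made purely for illustration; none of the numbers are the study's data.

```python
def simple_r_squared(x, y):
    """R^2 of the least-squares line y = a + b*x (one numeric predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if s_xx == 0 or s_yy == 0:
        raise ValueError("x and y must both vary")
    # Share of the variance in y that the fitted line accounts for.
    return s_xy ** 2 / (s_xx * s_yy)


# HYPOTHETICAL toy data: group coded 0-3 (an assumed coding),
# score = number of 2P placements out of 12 target sentences.
group_code = [0, 0, 1, 1, 2, 2, 3, 3]
score_2p = [12, 11, 12, 12, 11, 12, 12, 12]
r2 = simple_r_squared(group_code, score_2p)
print(round(r2, 3))
```

A value close to zero, as with scores that sit near the ceiling in every group, means that group membership accounts for almost none of the variance in the number of 2P placements.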

Turning now to the variation in the use of the verb *trebati*, we can say that whatever variation there is in the use of this verb, it is confined to the Croatian variety. All the participants from Serbia (both groups from Subotica and the group from Ruma) used this verb in its experiencer-like form. There were no instances of this verb used as a simple transitive in these three groups. On the other hand, we found that there is substantial variation in the use of this verb within the group from Zagreb. About a third of the elicited utterances containing the verb *trebati* were characterized by the simple transitive use (the mean value was 1.06 with 3 being the maximum). Curiously, it was not the case that out of 30 participants approximately a third used *trebati* as a transitive verb exclusively and the remaining 20 participants used this verb only in its experiencer version. The instances of *trebati* as a transitive verb were much more evenly distributed within the group, with some speakers using this verb two times as an experiencer verb and once as a transitive one. Of course, there were also those who produced two sentences with a transitive *trebati* and one with its experiencer-like counterpart. Crucially, the outcome was that, in fact, fewer than a third of the participants from Zagreb consistently used *trebati* as an experiencer verb with no instances of its transitive version.
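As a rough arithmetic check on the "about a third" figure, assume that the reported mean of 1.06 is the average number of transitive responses per Zagreb participant over the 3 target items, and that all 30 participants answered all three items (both are assumptions, as the chapter does not spell this out):

```latex
\[
\frac{1.06}{3} \approx 0.35,
\qquad
30 \times 1.06 \approx 32 \ \text{transitive responses out of}\ 30 \times 3 = 90 .
\]
```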


### **4 Analysis**

Armed with these empirical insights about the patterns of variation with these three constructions, we can turn to the question of what these insights can tell us about their underlying structure. Also, we might be able to derive some suggestions as to the broader theoretical questions dealing with the nature of syntactic variation hinted at in the introduction. These will be the topics of this section.

### **4.1 Infinitive vs. da+present**

On a general note, one can say that the da+present construction has received much more attention in the syntactic literature than infinitives (Todorović 2012; Mišeska-Tomić 2004). The fact that these two structures can be found in virtually free variation is rarely addressed (see Belić 2005 for exceptions). Todorović & Wurmbrand (2015) note that the *da* particle found in da+present constructions can function as a complementizer (8), a modality marker (9) and a finiteness marker on *v* (10).

(8) Jovan je tvrdio da čita knjigu.
Jovan aux.3sg claimed da read.pres.3sg book
'Jovan claimed to be reading the book.'

(from Todorović & Wurmbrand 2015)


Even though under certain conditions (9) would also allow an infinitive after the main verb, our study focused on structures like (10), where the infinitive alternates with da+present most clearly.<sup>3</sup> Todorović & Wurmbrand (2015) treat infinitives as bare VPs based on the fact that they seem to be unable to assign accusative case, typically associated with the causative *v*. That way, they postulate a syntactic difference between these two structures because sentences like (10) contain a full *v*P at least. Under their account, sentences like (11), where the infinitive is used as the complement of the phasal verb, should have a bare VP in the embedded part. They claim that the accusative case on the object of the infinitive is assigned by the matrix verb.

<sup>3</sup>Example (8) does not allow the alternation with infinitives, while the tense of the embedded clause can be varied, and the reference of the subject of the embedded clause is not tied to the reference of the matrix clause subject. Sentences like (9) allow infinitives and da+present after the main verb only if the subjects of the matrix structure and the embedded structure are (referentially) the same. Having a (referentially) different subject is possible, but with da+present only. Sentences like (10) never allow referentially different subjects in the embedded and the matrix part. We leave the variation of infinitives and da+present in sentences like (9) for further research.

(11) Marko je počeo raditi zadatak.
Marko aux.3sg started do.inf homework
'Marko started doing his homework.'

However, because the data illustrate the possibility of completely free variation, we will propose that both "low-da", corresponding to (10), and infinitives have the same structure: both are *v*Ps.<sup>4</sup> If infinitives and da+present were truly different structures, one would not expect to find speakers who use them interchangeably, as we did. A deeper structural difference of the *v*P / VP kind would give rise to clear preferences for one structure over the other either across regional varieties or, at the very least, across individual speakers.

There might also be some syntactic evidence against the claim that infinitives are merely VPs. The main piece of evidence is the availability of accusative case with infinitives in contexts where it is difficult to argue that the accusative is assigned by the matrix verb: copular constructions (12) and impersonals (13). If the ability to assign accusative is taken as a diagnostic, we should conclude that infinitives, like the "low-da" structures are *v*Ps.


Once both infinitives and da+present are reduced to essentially the same structure (i.e. *v*P), we can look at them as simply different instantiations of the same *v*<sup>0</sup>. The proposed structures for infinitives and da+present are in (14).

<sup>4</sup>By using the term "free variation", we refer to structural alternations that have no consequences for the semantics and pragmatics of the sentence as a whole.

If infinitives and da+present are merely different instances of the same *v*<sup>0</sup>, we can expect the kind of variation that we observed in our study. In line with the general view in Adger & Smith (2005), we can assume that there are some speakers whose mental lexicons contain both of these *v* heads, which is why they can use them interchangeably.

The structures in (14) raise one additional problem. Namely, it is unclear what the status of the embedded subject with infinitives and da+present should be. Although both da+present and infinitives are subject to the same constraints regarding the interpretation of the null subject (obligatory control, sloppy reading only, etc.), in impersonal constructions, we note a clear asymmetry with respect to the impersonal (reflexive) morpheme *se*. With da+present, *se* (and in fact all kinds of pronominal clitics) obligatorily stays inside the *v*P, whereas with infinitives, it shows up with the matrix verb; see (15).

(15) b. \* Juče **se** moglo da peva.
yesterday refl can.past.3sg da sing.3sg
Intended: 'It was possible to sing yesterday.'

c. Moglo **se** pevati.
can.past.3sg.n refl sing.inf
'It was possible to sing.'

While the behavior of *se* with da+present supports our assumption of the existence of a null element in Spec*v*P (which needs to be targeted/"switched off" by *se* impersonalization), the fact that *se* surpasses the infinitive predicate poses a problem for the uniform structural treatment of the infinitive and da+present, and calls into question the postulation of the *v*P layer in infinitives.<sup>5</sup> It is also possible that infinitives simply lack Spec*v*P, which would still retain structural uniformity.

Wurmbrand (2003) and Todorović & Wurmbrand (2015) argue that there is no PRO with (restructuring) infinitives, i.e. that infinitives lack a syntactic subject altogether, and that interpretation comes from the matrix subject. However, a simpler way of capturing the relevant facts would be by postulating a difference in terms of the presence/absence of Spec*v*P rather than saying that infinitives lack the *v*P layer completely. Again, saying that there is no *v*P with infinitives would leave sentences like (12) and (13) unexplained.

Impersonal contexts again provide us with evidence that the interpretation of the external argument of the infinitive is dependent on interpretation of the matrix predicate subject. Namely, infinitives are only possible with impersonal *se*. If *se* is absent, as in (16), only predicates without a referential subject (e.g. weather verbs, such as *grmeti* 'thunder' in (16c)) are possible.<sup>6</sup> No such restrictions hold for da+present (16b), where *se* obviously takes care of getting the proper interpretation for the embedded predicate (indefinite, human).

(16) a. \* Moralo / Moglo je pevati.
must.past.n / can.past.n aux.3sg sing.inf
Intended: 'One had to / could sing.'

<sup>5</sup>Krapova's (1999) analysis of a structure virtually identical to da+present also assumes the existence of PRO in those contexts.

<sup>6</sup>Presumably, the subject of these matrix predicates is a kind of expletive *pro*.



Curiously, copular constructions and impersonal *trebati* 'need', which also lack an overt matrix subject, show no such restrictions with respect to the infinitive (cf. (12) and (13)). It is possible that in these contexts we are dealing with what Wurmbrand (2003) calls "non-restructuring" configurations. These configurations would then be *v*P infinitives, which are different from the restructuring (VP) ones found after modals and verbs such as *try* or *begin*. At this point, we cannot provide a definitive resolution of this issue. The crucial test for a true case of restructuring is the availability of long passive. However, long passive in Serbian is possible only with impersonal se-passive (cf. Todorović & Wurmbrand 2015), while be-passive is not allowed in these constructions. The examples in (17) show the unavailability of long be-passive, see (17a) and (17b), with both infinitive and da+present complements, together with the acceptable se-passive versions.


We leave open the question of obligatory control and whether the "PRO interpretation" of the infinitive requires a syntactic position or not.

It should be noted that some speakers report subtle differences in meanings of these two constructions. Examples like (18) illustrate some of these subtle differences. For speakers from central Serbia, these examples can mean simply negated future. For many speakers from Vojvodina, however, these sentences mean the lack of volition with both present and future temporal reference. These speakers prefer to use the infinitive for the meaning of negated future.

(18) Mi we nećemo neg.will.1pl to that da da radimo. do.1pl 'We will not do that.'

In sum, once we assume that da+present and infinitive are two versions of the same *v* head, it becomes possible to explain the variation between the two as a consequence of the roughly equal availability of these two heads in the mental lexicons of such speakers. Also, the reason why some speakers consistently use one and never the other would be because their mental lexicon contains only one variety. At this point, the suggestion is to treat them as the same underlying structure.

### **4.2 Clitics**

Concerning the difference between the 2W and 2P rules in the placement of clitics, one can identify two approaches. In one of the views, these two options are the same in terms of their underlying form (Ronelle 2006; 2008; Yu 2008). The difference, then, stems from the application of two different phonological processes, one of which inserts the clitics after the first word while the other one inserts the clitics after the first phrase. Crucially, these phonological processes apply differently across the dialect continuum. The difference is thus understood to be purely sociolinguistic. Anderson (2005), surprisingly, even suggests that the use of the 2W rule is not possible in Serbian, where only the 2P rule can be found.

Other authors argue that this difference is not purely sociolinguistic in nature. For instance, Diesing et al. (2009) argue that sentences in which the 2W rule is used have a marked pitch contour, which suggests prosodic focus on the prenominal modifier. Moreover, such sentences are claimed to be felicitous only in contexts where the prenominal modifier is contrastively focused. The 2P rule, on the other hand, is applicable to broad focus contexts and is, thus, interpreted as unmarked. Bošković (2009) proposes different syntactic derivations for the two rules. In his view, the 2W rule is derived by left-branch extraction of the prenominal modifier which then functions as an anchor for the clitic. Our own intuitions suggest that answers like (19b), where the 2W rule is applied, are not necessarily infelicitous in response to questions like (19a), which are a clear indication of a broad focus situation.


	- b. Onaj that je aux.cl čovek man došao come kasno. late 'That man came late.'

In that sense, we do not agree with the restrictive differentiation given by Diesing et al. (2009) although we think that at least in the Serbian variety, the 2W rule carries some additional pragmatic or semantic cues. Crucially, these additional meanings should not be interpreted in terms of contrastive focus.

The results that we obtained in the study clearly point towards the approach taken by this second group of authors, who disagree with the idea that the difference between 2W and 2P is merely sociolinguistic. If that were the case, we would see a clear pattern of difference in the number of instances of the 2W rule among our four groups, similar to what we observed with infinitives and da+present. However, the results show no statistically significant difference in the way in which clitics are placed within the sample. Virtually all participants opted for the 2P rule. This would not come as a surprise to those who claim that 2W and 2P sentences are different in terms of their syntax (Bošković 2009) and/or semantics and pragmatics (Diesing et al. 2009). The results are not surprising under the second set of accounts because our sentences were given without additional contextual information, which would be needed to elicit 2W sentences.

### **4.3 The verb** *trebati* **'need'**

Our data show very clearly that the variation between transitive and experiencer *trebati* 'need' is confined to the Croatian variety. What is more, ethnicity does not play a major role in the use of this verb. This was shown by the fact that there were no instances of *trebati* as a transitive verb in the ethnically Croatian group from Northern Serbia. In that sense, variation is determined by regional factors.

As transitive *trebati* never occurs in Serbian, the simplest assumption then is that many Croatian speakers have two different lexical items, which some of them may use interchangeably, which is again in line with the approach to variation adopted here. However, before we dismiss the variation with *trebati* as uninteresting and too straightforward, we need to point out the change we note with the modal *trebati* in Serbian. Even though standard/prescriptive grammars and practices go to great lengths to preserve its special status as the only modal which can occur only as an impersonal, the speakers of Serbian show more and more agreeing patterns, whereby *trebati* agrees with the fronted (topicalized or focalized) embedded subject.

	- b. Devojke<sup>1</sup> girls.pl.f **je** aux.3sg **trebalo** needed.sg.n [da da t<sup>1</sup> otpevaju sing.3pl tu that pesmu]. song 'Girls should have sung this song.'
	- c. Devojke<sup>1</sup> girl.pl.f **su** aux.pl **trebale** needed.pl.f [da da t<sup>1</sup> otpevaju sing.3pl tu that pesmu]. song 'Girls should have sung this song.'

The source of variation in these examples is very interesting because it might be linked to the similarities and differences in the structure of infinitives and da+present complements discussed in this paper. Namely, the Croatian equivalent of (20a) is (21) where the modal *trebati* has to agree with the subject.

(21) Djevojke girl.pl.f **su** aux.pl **trebale** needed.pl.f otpjevati sing.inf tu that pjesmu. song 'It was needed / necessary that girls sing this song.' / 'Girls should have sung this song.'

The infinitival complement in (21) is incapable of hosting an overt subject and, as Todorović & Wurmbrand (2015) argue, it is quite possible that it does not even project a syntactic position capable of hosting a subject. Therefore, in Croatian, the subject would have to be base generated with *trebati*, which is why we observe agreement on the modal. On the other hand, da+present complements always project a Spec*v*P position, which is sometimes occupied by PRO and sometimes hosts an overt subject. In (20a), for instance, we find an overt subject with da+present; hence, the modal *trebati* is impersonal. However, if da+present and infinitives are in the process of becoming the same structure, as we argued here, the system is forced to accommodate, which is why we are observing the development of a personal use of the previously impersonal modal *trebati*. Based on these facts, we could speculate that the development of a transitive use of the lexical verb *trebati* is linked to this difference in the modal use, but further research is needed to establish this relationship more firmly.


### **5 Conclusions**

In conclusion, we have provided empirical evidence that some speakers belonging to the Serbian/Croatian dialect continuum can alternate between infinitives and da+present constructions without any restrictions even in a written production task. This fact was taken to mean that these are the same underlying structures. Additional syntactic evidence pointing to the same conclusion was also provided. We left open the question of the existence of an active Spec*v*P position with infinitives as we found some suggestions that the nature of the embedded subject with infinitives and da+present might be different in certain respects.

A different pattern of variation was found with respect to 2W and 2P clitics. Namely, previous accounts that tie the difference between 2W and 2P clitics to sociolinguistic considerations would predict sharp differences among the four groups of participants in our sample in terms of the use of these two rules for clitic placement. However, such differences were not observed and virtually all participants used the 2P rule exclusively.

Finally, variation in the use of the verb *trebati* as an experiencer verb and as a simple transitive was observed only within the group in Zagreb. No instances of this verb used as a simple transitive have been observed in the groups from Serbia, including the group made up of students with the Croatian ethnic background. In this domain, variation is determined by regional rather than ethnic factors. Also, many speakers who produced sentences with *trebati* as a transitive verb used it as an experiencer verb as well. We have suggested that there are two competing lexical entries for the verb *trebati*, one specified as a transitive verb and the other specified as an experiencer verb, in the mental lexicons of many speakers of Croatian.

The results obtained show a high degree of flexibility in the use of certain syntactic structures like da+present and infinitives. A significant share of the participants in the study used these structures interchangeably in a controlled production study (i.e. a fixed sociolinguistic context) without any obvious consequences for the semantics and pragmatics of the resulting output. Such a high degree of flexibility is surprising under traditional approaches to syntactic variation, where different output structures are expected to arise from different sociolinguistic contexts and/or have different meanings. On the other hand, Adger's (2006) approach creates a much more fluid picture, where certain speakers are expected to use different structures interchangeably, often without any consequences for the meaning, and the speaker's decision to use one structure instead of the other is not necessarily triggered by a change in the sociolinguistic context. Since this is precisely what we found with respect to the use of infinitives and da+present in many speakers of Serbo-Croatian, broader theoretical implications of this study can be found in the fact that it fits into this more fluid picture of syntactic variation proposed by Adger (2006).

### **Abbreviations**


### **Acknowledgements**

The work on this paper was supported in part by the project number 178002, entitled *Languages and Cultures in Space and Time,* and project number TR32035, entitled *Development of Dialogue Systems for Serbian and other South Slavic Languages,* both financed by the Government of Serbia.

### **References**




### **Chapter 8**

## **On the lack of φ-feature resolution in DP coordinations: Evidence from Czech**

### Ivona Kučerová

McMaster University

The paper investigates feature valuation in the context of more than one accessible goal. Concretely, the paper provides novel empirical evidence that there is no φ-feature resolution in syntactic agree. The apparent feature resolution of gender and number agreement previously reported in the Slavic literature on agreement with coordinated DPs is a side-effect of the morphological realization of the person feature that arises at the syntax–semantics interface. Furthermore, the proposal suggests that even non-default overt morphological marking of agreement might not faithfully reflect the narrow-syntax feature valuation, a result which seriously questions the validity of some core generalizations about agreement properties of natural languages. The core data comes from the agreement with coordinated noun phrases in Czech.

**Keywords:** agree, multiple agree, feature resolution, pronouns, copular clauses, Czech

### **1 Introduction**

The Minimalist Program (Chomsky 1995) shifted the focus of the syntactic investigation from lexical categories to their feature composition, which in turn yielded a growing interest in relations among syntactic features themselves, specifically, the notion of agree (Chomsky 2000; Chomsky 2001; among others). More recently the debate has increasingly concentrated on the status of valued and unvalued features (Pesetsky & Torrego 2007) and the notion of feature valuation in and of itself. This paper addresses the question of whether syntactic agree can only copy and share existing values of features, or whether narrow syntax can derive new values of syntactic features.

Ivona Kučerová. 2018. On the lack of φ-feature resolution in DP coordinations: Evidence from Czech. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 169–191. Berlin: Language Science Press. DOI:10.5281/zenodo.2545521


The question does not directly arise in the work that investigates structures with a single accessible goal. There the focus is on the distinction of matching and valuation (Béjar & Rezac 2003, Pesetsky & Torrego 2007) and the question of infallibility of these operations (e.g., the notion of failed agree in Preminger 2009). The question becomes more intricate in the domain of investigation of syntactic structures with more than one accessible goal. While for the work on Multiple-Agree (Hiraiwa 2005), it is critical that feature values within the same agree link *must match*, the literature on agreement with coordinated DPs works instead with the assumption that narrow syntax *may derive* new values by combining conflicting feature values within an agree link (Farkaş & Zec 1995; King & Dalrymple 2004; Heycock & Zamparelli 2005; Marušič et al. 2015).<sup>1</sup>

The paper provides novel empirical evidence that there is no φ-feature resolution in syntactic agree. The apparent feature resolution of gender and number agreement previously reported in the literature is a side-effect of the morphological realization of the person feature that arises at the syntax-semantics interface. Furthermore, the proposal suggests that even non-default overt morphological marking of agreement might not faithfully reflect the narrow-syntax feature valuation, a result which seriously questions the validity of some core generalizations about agreement properties of natural languages. The core data comes from the agreement with coordinated noun phrases in Czech.

### **2 Feature resolution in the Czech agreement system**

Standard Czech<sup>2</sup> distinguishes three grammatical genders, i.e., masculine (m), feminine (f), neuter (n), and two grammatical numbers, i.e., singular (sg) and plural (pl). In addition, masculine gender is marked for animacy, i.e., there is a specialized case and agreement marking for animate (ma) and inanimate (mi) masculine nouns and the elements that morphosyntactically agree with them. While the ultimately four-way distinction is fully preserved in singular agreement and case marking, there is a partial syncretism in plural. The system distinctly marks neuter plural and masculine animate plural but collapses the distinction between masculine inanimate and feminine.<sup>3,4</sup> The richness of the morphological marking thus lends itself easily to investigating agreement with coordinated noun phrases.

<sup>1</sup>The existing approaches to agreement with coordinations range from strictly morphosyntactic, as in Marušič et al. (2015), to strictly semantic, as in Lasersohn (1995). A majority of the current approaches combines both morpho-syntactic and semantic derivation, as pioneered in Farkaş & Zec (1995).

<sup>2</sup>I use the label Standard Czech for a non-vernacular variety of an interdialect shared by most native speakers of Czech and based on the modern codified standard of the Czech language.

According to the existing grammatical descriptions (e.g., Panevová & Petkevič 1997), if nominal conjuncts differ in their φ-features, the agreement with both conjuncts is resolved along a markedness hierarchy, sensitive to animacy and gender marking.<sup>5</sup> Thus, animate masculine is the most marked feature, with masculine inanimate and feminine ranked over neuter. This means that if one of the conjuncts is masculine animate, the plural agreement is going to be masculine animate, as shown in (2).<sup>6</sup> If there is no masculine animate noun but one of the conjuncts is masculine inanimate or feminine, the plural agreement is the syncretic masculine inanimate/feminine agreement, as shown in (3). The order of the conjuncts does not affect the agreement pattern.<sup>7</sup> For simplicity of the presentation, I refer to the former agreement pattern as animate agreement and the latter one as gender agreement.

(1) **Feature-resolution markedness**

animacy (ma) ≻ gender (mi/f) ≻ neuter (n)

	- a. {Kočka cat.f.sg / kotě kitten.n.sg / dobytek} cattle.mi.sg a and pes dog.**ma** jedli ate.pp.**ma**.pl ze from stejné same misky. bowl

'The cat/kitten/cattle and the dog ate from the same bowl.'

f/n/mi + **ma** = **ma (animate)**

<sup>3</sup>The fact that feminine is collapsed with masculine inanimate in and of itself provides a strong indication that animacy plays no role in the syntactic construal of the feminine value of the gender feature.

<sup>4</sup>The syncretism pattern plays out somewhat differently in dialects; see, e.g., Karlík et al. (2002: 392–404) for morphological features that distinguish Bohemian dialects from their Moravian counterparts (Central and Eastern Moravian). Discussing the dialectal variation goes beyond the scope of this paper, but a preliminary exploration is attempted in section 5.

<sup>5</sup>Czech allows both first-conjunct agreement and agreement with both conjuncts. For now I leave the first-conjunct agreement pattern aside as it does not directly inform the empirical description of the feature resolution.

<sup>6</sup>Data with simple agreement patterns are based on my native speaker intuitions and existing grammar descriptions (primarily, Panevová & Petkevič 1997; Corbett 1983). Data testing for combinations of features are based on elicitation of grammatical judgements from 4–6 native speakers.

<sup>7</sup>The (b) orders tend to be judged as less natural, a fact related to the asymmetric nature of coordinated noun phrases (see, e.g., Johannessen 1996), unless the ordering becomes relevant.


b. Pes dog.**ma** a and {kočka cat.f.sg / kotě kitten.n.sg / dobytek} cattle.mi.sg jedli ate.pp.**ma**.pl ze from stejné same misky. bowl 'The dog and the cat/kitten/cattle ate from the same bowl.'

f/n/mi + **ma** = **ma (animate)**

(3) **gender (mi/f)** ≻ **neuter (n)**

a. Kotě kitten.n.sg a and {kočka cat.**f**.sg / dobytek} cattle.**mi**.sg jedly ate.pp.**{mi/f}**.pl ze from stejné same misky. bowl

'The kitten and the cat/cattle ate from the same bowl.'

n + **mi/f** = **{mi/f} (gender)**

b. {Kočka cat.**f**.sg / dobytek} cattle.**mi**.sg a and kotě kitten.n.sg jedly ate.pp.**{mi/f}**.pl ze from stejné same misky. bowl

'The cat/cattle and the kitten ate from the same bowl.'

n + **mi/f** = **{mi/f} (gender)**

Upon closer examination the markedness behaviour is rather puzzling. In other domains that involve a feature resolution along the markedness hierarchy, if there is a conflict, the system resorts to the less marked feature. This is not the case here. Masculine animate systematically emerges as the winner even though in other domains it is morphologically the most marked feature. Conversely, neuter never survives in coordination agreement patterns, even though in other environments it behaves as the morphologically least marked feature, e.g., as the feature used in failed-agree environments with no syntactic probe, as in (4).

	- that Peter neg.came nebylo neg.was.pp.n.sg dobré. good.n.sg 'That Peter didn't come wasn't good.'

One could argue that neuter cannot participate in a syntactic resolution because it is in some sense defective. Such a conclusion goes in line with the following observation. Not only does neuter never win in a combination with other gender values, but neuter-plural agreement arises only if both conjuncts are in neuter plural, as shown in (5). If either or both of the conjuncts are in neuter singular, the plural agreement cannot be in neuter plural, despite the fact there is a dedicated neuter plural agreement morphology. Instead, the agreement is the syncretic gender agreement.

(5) a. Kotě kitten.n.sg a and štěně puppy.n.sg {jedly ate.pp.{mi/f}.pl / \*jedla} pp.n.pl ze from stejné same misky. bowl

'The kitten and the puppy ate from the same bowl.'


The question of why animate masculine should behave as if it were less marked than inanimate masculine remains. The overall agreement-resolution pattern is summarized in Table 1.<sup>8</sup>
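The past-tense resolution pattern described in this section can be restated as a short procedure. The following Python sketch is only an illustration of the descriptive generalization in (1) together with the neuter-plural restriction shown in (5); the function name, the tuple encoding of conjuncts, and the output labels are assumptions made for the example and are not part of the original description.

```python
# Illustrative sketch of the descriptive past-participle pattern in section 2.
# Gender labels follow the text: "ma", "mi", "f", "n"; number is "sg" or "pl".
# The function name and the (gender, number) encoding are hypothetical.

VALID_GENDERS = {"ma", "mi", "f", "n"}
VALID_NUMBERS = {"sg", "pl"}


def resolve_past_participle(conjuncts):
    """Return the plural agreement class for a coordination of DPs.

    conjuncts: list of (gender, number) tuples, e.g. [("f", "sg"), ("ma", "sg")].
    Returns "ma.pl" (animate agreement), "mi/f.pl" (syncretic gender
    agreement), or "n.pl" (neuter plural).
    """
    if len(conjuncts) < 2:
        raise ValueError("a coordination needs at least two conjuncts")
    for gender, number in conjuncts:
        if gender not in VALID_GENDERS or number not in VALID_NUMBERS:
            raise ValueError(f"unknown feature value: {(gender, number)}")

    genders = [gender for gender, _ in conjuncts]

    # animacy (ma) outranks everything else, irrespective of conjunct order
    if "ma" in genders:
        return "ma.pl"
    # gender (mi/f) outranks neuter
    if "mi" in genders or "f" in genders:
        return "mi/f.pl"
    # only neuter conjuncts remain: neuter plural requires all of them to be plural
    if all(number == "pl" for _, number in conjuncts):
        return "n.pl"
    return "mi/f.pl"


print(resolve_past_participle([("f", "sg"), ("ma", "sg")]))
print(resolve_past_participle([("n", "sg"), ("n", "sg")]))
```

The sketch deliberately ignores first-conjunct agreement, which the text sets aside, and it makes no claim about how the pattern is derived.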

### **3 The puzzle: Different probe = different feature resolution**

One could dismiss the emergence of the masculine animate plural agreement as insignificant, if it were not for an additional and much more serious empirical problem. The generalizations reported in the literature are strictly based on examples in the past tense. The past tense in Czech is morphologically realized by a finite auxiliary, which agrees in person and number and is null for 3rd person, and a past participle, which agrees in number and gender with a structural subject in nominative case. Strikingly, the feature-resolution generalization reported in the previous section does not extend to other constructions in which we see plural agreement in gender, i.e., agreement with adjectival predicates and passive participles.

<sup>8</sup>For Panevová & Petkevič (1997), the plural agreement for the first conjunct being mi is mi. Since there is no empirical evidence that mi.pl and f.pl are distinct, I use the descriptive mi/f label instead. The same holds for the first conjunct being f.

Table 1: Agreement-resolution patterns (adapted from Panevová & Petkevič 1997)

Agreement with adjectival predicates and passive participles plays out rather differently. As it turns out, if the gender features on conjuncts do not match, plural agreement is fully grammatical only if one conjunct is masculine animate and the other conjunct is grammatically feminine but may be semantically construed as animate, as in (6).<sup>9</sup>

	- b. Pes dog.ma.sg a and kočka cat.f.sg byli were.pp.ma.pl unavení. tired.pp.ma.pl 'The dog and the cat were tired.'

ma + f = **ma (animate)**

<sup>9</sup>The consistency of masculine animate agreement in these patterns has been confirmed in Adam (2017), a large-scale (*N* = 103) elicitation study testing some of the data from an unpublished version of this paper. Adam tested only animate coordinations and confirmed that whenever one of the conjuncts is masculine animate, plural agreement is masculine animate, irrespective of the order of the conjuncts.


If there is no masculine animate gender, grammatically inanimate gender combinations are strongly degraded even if they semantically denote animate objects. When the coordination contains an inanimate masculine noun and a neuter, the expected agreement, i.e., masculine inanimate (syncretic with feminine plural), is strongly degraded, (7). Speakers I tested strongly preferred colloquial morphology (Common Czech), which is completely syncretic in plural, i.e., no gender or animacy distinction is preserved in the system (e.g., Karlík et al. 2002: 76), (8).


(8) Dobytek cattle.mi.sg a and kotě kitten.n.sg byly were.pp.pl unavený. tired.pp.pl 'The cattle and the kitten were tired.'

As for the combination of feminine and neuter, no agreement pattern is fully acceptable either. In a forced written elicitation task reported in Adam (2017), speakers volunteered feminine plural (62%), i.e., the prescriptively required agreement, neuter plural (about 26%), i.e., syncretic plural in some dialects, or colloquial morphology (12%), i.e., fully syncretic agreement, (9). In my original data collection which was based on a spoken elicitation and a grammatical judgement task, speakers found the colloquial ending most acceptable.<sup>10</sup>

(9) ⁇ Kočka cat.f.sg a and kotě kitten.n.sg byly were.pp.f.pl unavené tired.pp.f.pl / unavená tired.pp.n.pl / unavený. tired.pp.colloq-pl

'The cat and the kitten were tired.' f + n = **⁇f/n/colloq (gender)**

Strikingly, when speakers are asked about a combination of masculine animate and neuter gender, irrespective of the number of the conjuncts, as in (10), they try to avoid the agreement altogether. The switch to the fully syncretic colloquial morphology improves the ratings but not as much as in (8). I label this class of avoidant judgements as agreement gaps and mark them with ⊛.

<sup>10</sup>Adam's study was based on data reported in the 2017 manuscript version of this paper. The judgements reported here thus reflect her finding. Adam didn't test any of the other feature combinations as her focus was on animate agreement and agreement with numerals.


(10) a. ⊛ Pes dog.ma.sg a and kotě kitten.n.sg byli were.ma.pl ⁇{unavené tired.pp.mi/f.pl / unavení pp.ma.pl / unavená}. pp.n.pl

Intended: 'The dog and the kitten were tired.'

ma.sg + n = ⊛

b. ⊛ Psi dogs.ma.pl a and koťata kittens.n.pl byli were.ma.pl ⁇{unavené tired.pp.mi/f.pl / unavení pp.ma.pl / unavená}. pp.n.pl

Intended: 'The dogs and the kittens were tired.'

ma.pl + n.pl = ⊛

Recall that the past-tense pattern was not fully syncretic, yet the feature resolution was always possible.<sup>11</sup> Furthermore, if indeed some form of morphological syncretism is in place, then the resolution pattern cannot be attributed to a narrow-syntax valuation as suggested in the existing literature.

To summarize, the fact that the gender-resolution pattern does not extend to the predicative-adjective and passive agreement shows clearly that whatever the process behind the seeming feature resolution is, it cannot be a result of narrow-syntax feature valuation as part of agree with more than one accessible goal. The next section proposes a theoretical alternative.
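For ease of reference, the judgements reported in this section for predicative adjectives and passive participles can be collected in a small lookup table. The Python sketch below merely restates the data in (6) to (10); the dictionary name, the status labels, and the decision to treat gender pairs as unordered are assumptions made for the illustration (order-independence is reported in the text only for coordinations containing a masculine animate conjunct).

```python
# Illustrative restatement of the judgements in section 3 (predicative
# adjectives and passive participles). All names here are hypothetical.
# Keys are unordered gender pairs; treating them as unordered is an assumption.

ADJECTIVAL_JUDGEMENTS = {
    # (6): fully grammatical; the feminine noun can be construed as animate
    frozenset({"ma", "f"}): ("ma.pl", "fully grammatical"),
    # (7): expected mi/f agreement is strongly degraded; colloquial form preferred, (8)
    frozenset({"mi", "n"}): ("mi/f.pl", "strongly degraded"),
    # (9): no agreement pattern is fully acceptable
    frozenset({"f", "n"}): (None, "no fully acceptable pattern"),
    # (10): speakers avoid the agreement altogether
    frozenset({"ma", "n"}): (None, "agreement gap"),
}


def adjectival_judgement(gender_a, gender_b):
    """Look up the reported judgement for a pair of mismatched conjunct genders.

    Returns (agreement, status); combinations not discussed in the section
    are returned as (None, "not reported").
    """
    key = frozenset({gender_a, gender_b})
    return ADJECTIVAL_JUDGEMENTS.get(key, (None, "not reported"))


for pair in [("ma", "f"), ("mi", "n"), ("f", "n"), ("ma", "n")]:
    print(pair, adjectival_judgement(*pair))
```

Setting this table beside the past-tense pattern makes the contrast visible: every mismatched combination receives a resolved form with the past participle, whereas here only the masculine animate plus feminine combination does.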

### **4 You are what you probe**

If mismatched gender features on conjuncts were syntactically resolved within a conjunction phrase (ConjP), agreement with such a phrase should always realize the same features. As we have seen in the previous section, this prediction is not borne out. I argue that instead the resolution pattern depends on the unvalued features of the probe. In order to account for the data I propose the following generalization: If the value of the gender feature on the first conjunct and the value of the gender feature on the second conjunct do not match, feature resolution depends on whether the probe probes (a) only for gender (and number), or (b) whether it probes for person. If the probe (here, verbal predicate, including the past tense formation) probes for a valued person feature, we observe a resolution along an animacy scale. We saw this pattern in §2. If, however, the probe probes only for a valued gender feature, i.e., there is no unvalued person feature on the probe, feature resolution is severely limited and may even yield agreement gaps. This is the pattern we saw in §3. The question that arises is why the apparent feature resolution plays out differently for different probes.

<sup>11</sup>One could argue that the difference between the animate ending *-i* and the gender-plural ending *-y* is no longer preserved in modern Czech as the original phonological distinction does not exist anymore, i.e., the corresponding past tense forms are homonyms. Yet, the neuter plural ending is clearly distinct which makes a syncretic explanation untenable.

We already concluded that a gender-feature resolution as part of narrow-syntax-agree valuation cannot be the answer. In order to understand the pattern we need to turn to the question of how the label of a conjunction phrase and its corresponding features are determined. That is, the proposed analysis will implement two factors: unvalued features on the probe and the feature composition of the label of the conjunction phrase.

In a nutshell, I argue that the label of the conjunction phrase is determined at the syntax-semantics interface, and that the labelling process is analogous to the feature resolution attested in split-antecedent pronouns, i.e., plural pronouns that simultaneously refer to more than one antecedent (e.g., *you* and *I* gives *we*). The literature on split-antecedent pronouns (Heim 2008, Sudo 2012) provides an explicit algorithm for how the features of the referring pronoun are computed from a mixed-feature input. In the present proposal, the actual agreement is then modelled as a narrow-syntax agree that targets the conjunction-phrase label as the syntactic representation of the conjunction phrase, where a label is a syntactic representation of all features present in the corresponding extended projection and relevant for further syntactic structure building. There are four components. First, agree is successful only if the label provides features that match the features of the probe. Second, following Sudo (2012), I assume that the syntax-semantics interface manipulates semantic indices (i.e. numerical pointers). Crucially, indices are complex structures, enriched by person, gender, and number information. Third, this complex-index information can be mapped onto morphology. Fourth, morphology can only realize features uniquely determined and valued by the label of the probe. The consequence is that if agree probes for person, agreement reflects the complex features of the indices. If agree probes for gender, it can only use gender features available to the narrow syntax component. In other words, while semantics can build new objects (complex indices), syntax can only copy existing values of features. Consequently, if agree probes for person, it can use the complex structures built by the syntax-semantics interface. If agree probes for gender, it may only use features already present in narrow syntax.

### **4.1 Features of the conjunction-phrase label**

The idea that there is a connection between agreement with coordinated noun phrases and features of split-antecedent pronouns is intellectually indebted to Farkaş & Zec (1995), who proposed a striking generalization, namely, that the morphological features on the predicate agreeing with a nominal conjunction are always identical to the morphological features of a pronoun anaphorically referring to the same coordination. To implement this idea I follow Heim (2008) in her treatment of split-antecedent pronouns and Sudo (2012) in his treatment of complex indices underlying the morphological representation of split-antecedent pronouns.

Furthermore, I follow Narita (2011) and Chomsky (2013) in that labelling is a process triggered by the semantic interface (CI) and argue that the person feature is crucial in the labelling process in that it provides a formal connection between narrow syntax (person as a syntactic feature) and semantic representation (person mapped on a (referential) index; Longobardi 2008, Sudo 2012, Landau 2010, among others). The connection arises via implementing the person feature as [±participant] (Nevins (2007) and the literature cited there). Furthermore, I follow the literature on coordination that argues that the plurality of a nominal conjunction is computed as semantic plurality (Munn 1993, Bošković 2009, Bhatt & Walkow 2013). Technically, I implement a semantic plurality as a conjunction of person features, more precisely, semantic plurality is a conjunction of nonmatching indices based on the person feature.

For concreteness, I assume the person-feature hierarchy and its morphological mapping as exemplified in Figure 1. Note that the implementation via the [±participant] feature lends itself easily to accounting for the intrinsic marking of animacy that is critical for the empirical pattern at hand. The next subsection provides a detailed derivation of the attested patterns.

### **4.2 Accounting for the resolution pattern**

The first case to consider is the agreement patterns in which the probe probes for a person feature (the data discussed in §2). Based on the person-feature geometry in Figure 1, there are three basic cases to consider based on the label of the conjunction phrase: (a) there is a [+person] feature, valued as [+participant], (b) there is a [+person] feature, valued as [−participant], and (c) there is a [−person] feature. As for number, throughout the section I assume that both conjuncts will associate with an index and that the indices will not be identical. The assumption that semantic plurality corresponds to a conjunction of non-matching indices is motivated by examples such as that in (11). Consequently, the semantic number will be set as plurality and morpho-syntactically will correspond to a valued number feature (technically, [−sg]).


Figure 1: Feature hierarchy & morphological mapping (modelled after Harley & Ritter 2002 and Bartošová & Kučerová 2015)


The first case to consider is a case in which the conjunction label will contain a [+person] feature, valued as [+participant], and [−sg]. I argue that this labelling arises whenever one of the conjuncts is syntactically valued as masculine animate. The reason is that the masculine animate valuation corresponds to the [+person] feature, valued as [+participant]. Since the labelling operation takes place at the syntax-semantics interface, the system minimally searches the embedded structure for features binding to the semantic component. That is, if there is a [+person] feature and a [+participant] feature in the searchable domain, these features must be copied (technically, identity-merged) into the label of the conjunction phrase. Consequently, irrespective of the features of the other conjunct, the labelling reflects the presence of the semantically marked features. In turn, morphology copies the feature combination onto the plural animate agreement (traditionally called masculine animate plural) (see Bhatt & Walkow 2013 for an argument in favour of agreement as morphological copying). This configuration is exemplified in (12), repeated from (2) above. Notice that the morphological realization does not recognize the masculine animate feature as such; it solely realizes the valued [+participant] feature in the plural context.


(12) {Kočka / kotě / dobytek} a pes jedli ze stejné misky.
{cat.f.sg / kitten.n.sg / cattle.mi.sg} and dog.**ma** ate.pp.**ma**.pl from same bowl

'The cat/kitten and the dog ate from the same bowl.'

f/n/mi + **ma** = **ma (animate)**

Let us now consider the next case, in which there is a [+person] feature but no [+participant] feature in the label of the conjunction phrase. This case arises if none of the conjuncts is syntactically valued as masculine animate but one or both conjuncts are syntactically valued as masculine inanimate or feminine. Since the label is marked as [−participant], the morphological realization resorts to the gender marking in the context of plural, i.e., the syncretic morphology for masculine-inanimate plural and feminine plural. This feature combination is exemplified by (3), repeated below as (13).

(13) Kotě a {kočka / dobytek} jedly ze stejné misky.
kitten.n.sg and {cat.**f**.sg / cattle.**mi**.sg} ate.pp.**{mi/f}.pl** from same bowl
'The kitten and the cat/cattle ate from the same bowl.'

n + **mi/f** = **{mi/f} (gender)**

Now we can finally turn to the last case, in which the probe probes for person but none of the conjuncts is specified for a [+person] feature. Consequently, there is no participant-feature specification in the label of the conjunction phrase. This configuration arises when both conjuncts are neuter. Note that according to the feature geometry in Figure 1, neuter is syntactically not a gender feature but it arises as a realization of the [−person] feature. In turn, plural neuter cannot be systematically computed from the label of a coordination that refers only to person. Instead, the lack of positive valuation within the syntactic component means that the morphological realization must resort to the default gender realization (technically, failed agree, Preminger 2009). In Czech this means that morphology realizes the plural agreement as the syncretic plural gender form (mi/f). This feature combination is exemplified by (5a) and (5b), repeated below as (14) and (15).

(14) Kotě a štěně {jedly / \*jedla} ze stejné misky.
kitten.n.sg and puppy.n.sg {ate.pp.{mi/f}.pl / pp.n.pl} from same bowl
'The kitten and the puppy ate from the same bowl.'

n.sg + n.sg = **{mi/f} (gender)**


(15) Koťata a štěně {jedly / \*jedla} ze stejné misky.
kittens.n.pl and puppy.n.sg {ate.pp.{mi/f}.pl / pp.n.pl} from same bowl
'The kittens and the puppy ate from the same bowl.'

n.pl + n.sg = **{mi/f} (gender)**

The problem we just identified lies in the combinatorics behind the labelling operation. There is a caveat though. While the probe in these cases needs to be valued for person, it morphologically realizes gender features. Which is to say, if the label is uniquely labelled for gender from syntax, then morphology could realize the gender feature in and of itself. However, I argue that this may happen only if the syntactic features on both conjuncts are identical, i.e., only if narrow syntax provides n.pl as the common feature of the conjuncts. If this is the case, no feature calculation is necessary and the system solely copies the neuter plural label of its parts into the label of the conjunction phrase and this information determines the morphological mapping of the resulting agreement as neuter plural. This is the pattern we saw in (5c), repeated below as (16).

(16) Koťata a štěňata jedla ze stejné misky.
kittens.n.pl and puppies.n.pl ate.pp.n.pl from same bowl
'The kittens and the puppies ate from the same bowl.'

n.pl + n.pl = **n.pl**

The behaviour of neuter is crucial for our understanding of the overall system. Notice that there is no optionality in (16). Which is to say, if syntax can uniquely derive the values of syntactic features of the conjunction phrase label, agree must respect these values. If, however, syntax cannot uniquely derive these values (in our cases, because there is a feature-valuation conflict), then morphology refers to the features of indices derived by the syntax-semantics interface as the only available structural information.

We have successfully derived the complete pattern of the seeming gender-feature resolution by referring only to the person feature. Table 2 summarizes the features in the label that were relevant in the process and the morphological mapping they triggered.
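The case distinctions of this subsection can be restated as a small decision procedure. The following Python sketch is only an illustration of the pattern in (12)–(16), not part of the analysis itself: the names `Conjunct` and `person_probe_agreement` and the string encoding of gender and number values are assumptions made for the example, and the labelling process is compressed into a simple lookup over the two conjuncts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conjunct:
    """Hypothetical encoding of one conjunct's syntactic features."""
    gender: str  # "ma", "mi", "f", or "n"
    number: str  # "sg" or "pl"


def person_probe_agreement(first: Conjunct, second: Conjunct) -> str:
    """Return the plural agreement form for a probe that probes for person."""
    conjuncts = (first, second)
    genders = {c.gender for c in conjuncts}

    # Case (a): a masculine animate conjunct supplies [+person, +participant];
    # the label carries it irrespective of the other conjunct, as in (12).
    if "ma" in genders:
        return "ma.pl"

    # Case (b): [+person] without [+participant] (masculine inanimate or
    # feminine somewhere in the coordination), as in (13).
    if genders & {"mi", "f"}:
        return "mi/f.pl"

    # Case (c): only neuter conjuncts, i.e. [-person].
    # Identical n.pl features are simply copied into the label, as in (16).
    if all(c.number == "pl" for c in conjuncts):
        return "n.pl"

    # Otherwise valuation fails and morphology falls back on the default
    # syncretic plural form, as in (14) and (15).
    return "mi/f.pl"
```

Under these assumptions, `person_probe_agreement(Conjunct("n", "sg"), Conjunct("n", "sg"))` returns the syncretic form `mi/f.pl`, mirroring (14), whereas two neuter plurals return `n.pl`, mirroring (16).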

### **4.3 Accounting for the resolution failure**

Let us now turn to the data pattern discussed in §3, i.e., the pattern in which the probe does not have any unvalued person feature but probes for a gender feature instead. While the derivational procedure described in the previous subsection crucially relies on the ability of the syntax-semantics interface to construct a complex semantic index from the person representation in the label of the conjunction phrase, a probe that probes only for gender cannot use this complex information but must rely on the syntactically present valuation of gender. In turn, we expect the agreement patterns to play out differently.

Table 2: Labelling of the conjunction phrase and morphological realization: probe = person

Before we proceed to the individual patterns, let us consider the geometry of the gender features. According to the feature geometry of person proposed in Figure 1, only the masculine inanimate and feminine features correspond to a binary gender feature. Masculine animate corresponds to a morphological realization in the context of [+participant]. The syncretic masculine inanimate and feminine plural is a default realization of the [+person] feature, i.e., without a [+participant] feature. We have also seen that although neuter should in principle appear in the context of [−person], it does not, as it can only be copied. The core difference between the cases discussed in the previous subsection and the cases discussed in this subsection is that in the previous cases distinct values of person and participant features were resolvable in the process of complex index formation. The features that were used to value the unvalued person of the probe were indeed features that were mediated by the formation of the complex semantic index. The question is what happens if there is no uniform person representation mediated by the complex-semantic-index formation.


I argue that in such a case, an unvalued gender feature on the probe can be valued only if the conjuncts share the features relevant for the morphological mapping procedure. It follows that agreement will be successful only if (a) there is no mismatch of gender features (the trivial case) or (b) both conjuncts are [+participant]. All other combinations should be degraded. This prediction is borne out.

As we already saw, if one conjunct is masculine animate, the other conjunct must also be masculine animate, or a feminine that can semantically be construed as animate. This follows from the restriction that both conjuncts must be [+participant], that is, animate, as only animate entities can be modelled as participants. Consequently, in this feature combination, the plural agreement is animate, i.e., morphologically realized as ma. An example of this feature interaction is given in (6), repeated below as (17).

	- b. Pes a kočka byli unavení.
	  dog.ma.sg and cat.f.sg were.pp.ma.pl tired.pp.ma.pl
	  'The dog and the cat were tired.'

ma + f = **ma (animate)**

Note that in this case, although there is no uniform gender feature in the label, the shared [+participant] feature is sufficient for the derivation to converge.

If gender is not specified for animacy, there is no feature information in the label of the conjunction phrase that could be used to value the gender feature on the probe. There are two cases to consider. If there is no [+participant] feature, the combination is degraded but the speakers have an intuition what the best form would be. I argue this is because there is no valuation in syntax. Yet, the speakers can use their knowledge of what the feature formation would be if there was a person feature as a formal mediator. In other words, this is a case of syntactic valuation failure, with a partial rescue by morphology. An example of this combination is in (7)–(9), repeated below as (18)–(20).




The more interesting case is the one in which the label combines a [+person] and a [−person] feature. Without the complex semantic index being computed and used to value person on the probe, speakers clearly lack any indication of what the morphological mapping should be. In turn, there is no morphological form that could save the failed syntactic valuation. This is what underlies the agreement gaps we saw in (10), repeated below as (21).
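The outcomes described in this subsection can likewise be laid out as a rough decision procedure. The Python sketch below is an assumption-laden illustration rather than the analysis itself: the names `Conjunct` and `gender_probe_outcome`, the `animate` flag (standing in for a feminine that can be construed as animate), and the outcome strings are all invented for the example. Combinations that the text does not classify individually are grouped under the general expectation that all other combinations are degraded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conjunct:
    """Hypothetical encoding of one conjunct for a gender-only probe."""
    gender: str            # "ma", "mi", "f", or "n"
    animate: bool = False  # consulted only for feminines (assumption)

    @property
    def person(self) -> bool:
        # [+person] for every gender except neuter, which realizes [-person].
        return self.gender != "n"

    @property
    def participant(self) -> bool:
        # [+participant]: masculine animate, or a feminine construed as animate.
        return self.gender == "ma" or (self.gender == "f" and self.animate)


def gender_probe_outcome(first: Conjunct, second: Conjunct) -> str:
    """Classify agreement with a probe that probes only for gender."""
    # (a) The trivial case: no mismatch of gender features.
    if first.gender == second.gender:
        return "shared gender"
    # (b) Both conjuncts are [+participant]: animate plural agreement, as in (17).
    if first.participant and second.participant:
        return "ma.pl"
    # [+person] combined with [-person]: no form can rescue the failed valuation.
    if first.person != second.person:
        return "gap"
    # Remaining mismatches: degraded, possibly with a preferred form.
    return "degraded"
```

On this encoding, `gender_probe_outcome(Conjunct("ma"), Conjunct("n"))` returns `gap`, while a masculine inanimate combined with an inanimate feminine returns `degraded`.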


### **5 Predictions**

The core property of the system proposed in the previous section is that agreement with coordinated noun phrases is always mediated by the label of the conjunction phrase. Crucially, we saw that some agreement combinations cannot be resolved because of a problem with valuation of the agree probe because the label of the conjunction phrase has not been uniquely resolved. Interestingly, in the domain of agreement gaps, we saw that even if there is a good morphological match, the lack of successful valuation yields agreement failure. Consequently, if this reasoning is correct, we expect to find problems with valuation elsewhere. This section investigates two empirical domains that confirm this prediction.

Let us start with agreement gaps. If agreement gaps result from problems of labelling, i.e., from the fact that there is no unique feature in the label that could value an unvalued feature of the probe, we expect to find agreement gaps elsewhere. This prediction is borne out in comitative constructions and first-conjunct agreement constructions. Although in comitative constructions only one conjunct is in the nominative, agreement is with both conjuncts. This means that the agreement must be based on the features of the label of the conjunction phrase. Consequently, we expect agreement gaps to arise in exactly the same environment as with regular coordinated phrases. That is, we expect agreement gaps whenever the probe does not probe for person but only for gender, and whenever the conjuncts do not share gender features or are not both marked as [+participant]. This prediction is borne out, as can be seen, for example, in (22).

(22) ⁇ Pes s kotětem byli unavení / unavené / unavená.
dog.nom.ma.sg with kitten.instr.n.sg were.pp.ma.pl tired.ma.pl / {mi/f}.pl / n.pl
Intended: 'The dog and the kitten were tired.'

Interestingly, even if the predicate morphologically agrees only with the first conjunct, we predict that the adjectival agreement should be ungrammatical if the conjunction phrase cannot be uniquely labelled. This prediction follows if the morphological realization of agreement is post-syntactic but agree targets the label of the conjunction phrase. As the example in (23) demonstrates, this prediction is indeed borne out. To my knowledge no current theory of first-conjunct agreement predicts (23) to be ungrammatical.

(23) \* Byl unaven pes a kotě.
was.pp.m.sg tired.m.sg dog.nom.ma.sg and kitten.n.sg
Intended: 'The dog and the kitten were tired.'

Let us now turn to the second group of predictions. Without saying it explicitly, I assumed throughout the paper that the predicates probe only after the conjunction phrase was spelled out. This assumption follows from the fact that the relevant notion of labelling is a process that takes place at the syntax-semantics interface, which is to say, it is part of the spell-out procedure. The prediction then is clear: only elements that probe after the spell-out of the conjunction phrase can agree with both conjuncts. The reason is that without the label, there is no syntactic representation of the conjunction phrase that would combine features of both conjuncts. This prediction is borne out as well, as can be demonstrated on two agreement patterns.

First, if an adjectival adjunct modifies a conjunction, it must be syntactically adjoined before the conjoined phrase is spelled out. Consequently, even an adjunct that semantically modifies both conjuncts must morphosyntactically agree with only one of the conjuncts. The example in (24) demonstrates this point. Although the adjective 'young' may semantically modify only the man or it may modify both the man and the woman, it must agree only with the first conjunct. The plural agreement is ungrammatical.

(24) {\*mladí / mladý} muž a žena
{young.ma.pl / m.sg} man.ma.sg and woman.f.sg
'a young man and a young woman' or 'a young man and a woman'

This point can be further strengthened by the following fact. In Czech, determiners that semantically select for plurality cannot modify a conjunction of singular individuals. Thus, for example, *oba* 'both' is ungrammatical within a conjunction phrase, as shown in (25).

(25) ⊛ \*{oba / obě} kočka a kotě
{both.mi / both.f/n.pl} cat.f.sg and kitten.n.sg
Intended: 'both the cat and kitten'

### **6 Conclusions**

This paper contributes to our understanding of syntactic agree and its morphological realizations in four important respects. First, I presented an argument that narrow syntax cannot resolve a conflicting feature valuation. Syntax can only copy and share. Second, patterns that seem to involve some form of feature resolution are mediated by feature resolution at the syntax-semantics interface. Concretely, I argued that feature resolution arises only as part of semantic index formation, dependent on person-feature representation in narrow syntax. Third, I provided an empirical argument that labelling conflicts are fatal to feature valuation as agree. There is no morphological rescue. Fourth, I demonstrated that morphological features realized on agreeing elements do not have to faithfully match the underlying bundle of syntactic features. Although the final conclusion is not surprising in the light of the work done in the Distributed Morphology framework, it raises non-trivial questions about the empirical accuracy of generalizations in the domain of agreement.

The core argument presented in the paper relies on the very existence of combinations of features that cannot be syntactically resolved. The fact that there exist combinations that cannot be syntactically resolved in and of itself provides sufficient evidence that there cannot be a default syntactic mechanism that would underlie the seeming resolution patterns. Interestingly, as pointed out by two anonymous reviewers, there are naturally attested examples with seemingly parallel combinations of features that are perceived by native speakers as more acceptable or fully acceptable. Which is to say, there appear to be agreement strategies that go beyond the mechanics proposed in this paper. Providing an exhaustive description and a theoretical account of agreement resolution patterns in Czech dialects and Slavic in general goes beyond the present work. Yet I would like to conclude the paper with a couple of observations about the possible nature of the attested variation and its underpinning.

The data brought by the anonymous reviewers seem to fall into two groups: examples from colloquial Czech (dialects attested in the eastern part of the Czech Republic), as in (26), and examples with human participants, as in (27).


The data and judgements presented in this paper come from Standard Czech, a prescriptive variety that overlaps in the relevant morphological features with eastern Moravian dialects (e.g., Karlík et al. 2002: 401–404). Speakers of these dialects typically have the same or similar type of morphological syncretism and range of morpho-syntactic features as preserved in Standard Czech. Speakers of western dialects or Prague-centered colloquial varieties often lack the full range of distinct morpho-syntactic patterns. One might wonder whether the distinct morphological syncretism underlies examples such as that in (26). If that were the case, examples of this sort would provide a challenge to the present proposal.

We know, however, that the variation in agreement goes beyond morphological syncretism. The dialects fundamentally vary in their semantic index representation, as attested by differences in binding. Consider the example in (28).

(28) % Petr*<sup>i</sup>* má rád jeho*<sup>i</sup>* matku.
Petr has liked his mother
'Peter likes his mother.'

While (28) yields a severe Principle B violation in Standard Czech and Moravian dialects, it is fully acceptable in some Bohemian dialects (Jakub Dotlačil, p.c.). If, indeed, there is a connection between a person-feature resolution and semantic index representation and if differences in binding follow from differences in index representations (Heim 1998, Roelofsen 2008), it is not altogether surprising that we might find distinct resolution patterns. The same point applies to the inter-Slavic variation as reported in Corbett (1983) and much subsequent work. We know that agreement resolution varies in Slavic dialects. But equally there is an insufficiently studied variation in binding (e.g., Nikolaeva 2014).

The other point concerns an effect of humanness. It seems that at least in some cases replacing a non-human animate DP with a human-denoting animate DP improves the resolution pattern. We know independently that humanness closely interacts with a person representation (e.g., Ritter 2014; Wiltschko & Ritter 2015). It is possible that we see a related effect here as well.

A closer investigation of these intriguing patterns must, however, await future research.

### **Abbreviations**



### **Acknowledgements**

This research would not have been possible without funding from the Social Sciences and Humanities Research Council Insight Grants #435-2012-1567 and #435-2016-1034 (PI: I. Kučerová). Thanks to the audiences at NELS, Amherst, MA, and FDSL, Berlin for their questions and suggestions. Special thanks go to Betsy Ritter, Adam Szczegielniak, and Nick Welch for extensive discussion of the data and the analysis with me. The remaining errors are mine.

### **References**




## **Chapter 9**

## **Surviving sluicing**

Franc Marušič University of Nova Gorica

Petra Mišmaš University of Nova Gorica

Vesna Plesničar University of Nova Gorica

Tina Šuligoj University of Nova Gorica

> In this paper, we discuss examples of sluicing in Slovenian in which, in addition to a wh-phrase (or wh-phrases in instances of multiple sluicing), discourse particles appear. This is unexpected given Merchant's (2001) Sluicing-COMP generalization, as already observed in Marušič et al. (2015), even though there are several languages in which similar cases exist, e.g. German. In this paper we focus on discourse particles *pa* and *že* in (multiple) wh-questions and sluicing. These examples are not only important for our understanding of sluicing but are also crucial for analyzing discourse particles in Slovenian. Based on examples with sluicing and discourse particles in Slovenian, we argue against positioning these particles within the wh-phrase, clitic cluster or the IP.

**Keywords:** Slovenian, sluicing, particles, sluicing-COMP generalization

### **1 Introduction**

Franc Marušič, Petra Mišmaš, Vesna Plesničar & Tina Šuligoj. 2018. Surviving sluicing. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 193–215. Berlin: Language Science Press. DOI:10.5281/zenodo.2545523

In this paper we address the phenomenon already discussed in Marušič et al. (2015), i.e. cases in which, in addition to a wh-phrase, a discourse particle appears in sluicing in Slovenian. These cases are unexpected given the standard understanding of sluicing, which Ross (1969) describes as deletion of parts of the embedded question that are identical to some part of the antecedent clause, leaving only the wh-phrase, as shown in (1).

(1) I heard somebody, but I don't know [who I heard].

Despite Ross' definition of sluicing as a phenomenon in embedded clauses, today, sluicing is taken to be a type of ellipsis phenomenon "in which the sentential portion of a constituent question is elided, leaving only a wh-phrase remnant" (Merchant 2006: 271) which can occur in embedded or root clauses. In what follows, all Slovenian examples will be cases with sluicing in root clauses.<sup>1</sup>

The insight that only the wh-phrase remnant appears in sluicing is formalized in Merchant's Sluicing-COMP generalization, given in (2), in which "operator" stands for "syntactic wh-XP", "material" for any pronounced element, and "COMP" for "material dominated by CP but external to IP" (Merchant 2001: 62). Given a standard understanding of what CP represents, if one assumes the expanded left periphery à la Rizzi (1997), we assume this generalization was meant to be read as follows: In sluicing only wh-phrases survive ellipsis as they are the only elements occupying the left periphery. Apart from the wh-phrase, the left periphery does not contain any overt elements.<sup>2</sup>

### (2) **Sluicing-COMP generalization**

In sluicing, no non-operator material may appear in COMP.

(Merchant 2001: 62, (71))

Given this, as observed in Marušič et al. (2015), examples such as (3) are unexpected. In all examples given in (3), non-wh-material survives sluicing:<sup>3</sup>

<sup>1</sup>While sluicing also exists in embedded clauses in Slovenian, as (i) shows, examples with discourse particles, which we are looking at in this paper, are limited to root clauses. This is not surprising, given that discourse particles are typically a root clause phenomenon, which is related to their relation to both the illocutionary force and sentence type of the clause (Bayer & Obenauer 2011: 452).

(i) Vid je nekoga srečal. Ne vem, koga (\*pa).
Vid aux someone met.acc not know who.acc ptcl
'Vid met someone. I don't know who.'

<sup>2</sup>This seems exactly what Merchant's informal explanation of his generalization says: "The claim is that only segments directly associated with the syntactic operator – the wh-XP – will be found overtly in sluiced interrogatives." (Merchant 2001: 62)

<sup>3</sup>Wh-elements in Slovenian contain the wh-morpheme *k-/č*-, the particles, however, do not (cf. Marušič et al. 2015).


	- b. Vid je srečal nekoga. Koga že?
	  Vid aux met someone. Who ptcl
	  'Vid met someone. Remind me, who <did he meet>?'
	- c. Vid je srečal Janeza. Koga še?
	  Vid aux met Janez. Who ptcl
	  'Vid met Janez. Who else <did he meet>?'
	- d. Vid je srečal nekoga. Koga to?
	  Vid aux met someone. Who ptcl
	  'Vid met someone. Who <did he meet>?'
	- e. Vid je srečal nekoga. Koga spet?
	  Vid aux met someone. Who again
	  'Vid met someone. Who (are you saying again) <did he meet>?'
	- f. Vid je srečal nekoga. Koga pa to?
	  Vid aux met someone. Who ptcl ptcl
	  'Vid met someone. Who <did he meet>?'
	- g. Vid je srečal Ano pa še nekoga. Koga pa še?
	  Vid aux met Ana and also someone. Who ptcl ptcl
	  'Vid met Ana and someone else. Who else <did he meet>?'

As wh-phrases in sluicing can also be complex, as in (4), one can imagine these discourse particles that follow wh-words in (3) could also be part of a complex wh-phrase.

	- B: Katero punco?
	  which girl
	  'Which girl?'

As shown in Marušič et al. (2015) these particles do not form a constituent with the wh-material, but are rather a part of the extended left periphery (in the sense of Rizzi 1997) that is not elided in sluicing in Slovenian.<sup>4</sup>

<sup>4</sup>Note that examples in (3) are also not instances of swiping (Merchant 2002), as these particles are not prepositions, or spading (van Craenenbroeck 2010), as these particles are not demonstrative pronouns.


In this paper we look at the sluicing examples with particles more closely in order to better understand where exactly particles are located and where they originate. We present new arguments against placing particles inside wh-phrases and show that particles really are part of the left periphery and thus offer further support for the analysis according to which non-wh-material in the left periphery does not have to be elided in sluicing in Slovenian (Marušič et al. 2015).

We start with an assumption that sluicing is the ellipsis of the IP portion of a constituent question (Ross 1969; Merchant 2001; 2006), which means the examples in (3) are parallel to the examples in (5).<sup>5</sup> Based on this, we discuss the role of discourse particles in both wh-questions and sluicing. From now on, we gloss the two Slovenian particles under discussion as pa and že.

	- b. Koga že je Peter videl?
	  who.acc že aux Peter.nom saw
	  'Who did Peter see?'

While examples in (3) show that there are several discourse particles that can appear in sluicing in Slovenian, we focus only on discourse particles *že* and *pa* here. Some initial thoughts on other particles in sluicing can be found in Marušič et al. (2015), but since the role of Slovenian discourse particles in wh-questions has previously not been sufficiently described, we start off by showing some properties these elements display when they are used as discourse particles (and not topic or focus particles). Both *pa* and *že* have many different uses and meanings, which we will discuss in §2. In §3 we take examples with sluicing and the particles *pa* and *že* both to give new arguments against positioning these particles within the wh-phrase and to show that in addition to (complex) wh-phrases non-wh-material can also survive sluicing in Slovenian. In §4 we discuss the position of discourse particles with respect to the clitic cluster and the adverbs in the IP to show that discourse particles appear in the left periphery, higher than IP adverbs, confirming the earlier proposal by Marušič et al. (2015). §5 concludes the paper.

<sup>5</sup>We are avoiding the debate on the nature of the ellipsis site in sluicing, particularly whether it has the same exact structure as the antecedent (which is what we adopt, following Ross 1969, Merchant 2001, and others) or whether it is structurally empty with its content being supplied by re-using syntactic structure from some accessible point elsewhere in the discourse (which is what Chung et al. 1995, 2011, among others are arguing for). Our data do seem to favor the approach we are adopting, but we do not want to go into this discussion here.


### **2 Discourse particles** *že* **and** *pa* **in Slovenian wh-questions**

In general, the particles *že* and *pa* (as do other discourse particles in Slovenian) display some properties that are typically found with discourse particles cross-linguistically. For example, as Zimmermann (2011) notes, discourse particles carry more than one function and can also be used as focus particles, discourse markers (i.e. markers that establish coherence in the discourse) or adverbials.<sup>6</sup> This also holds for *pa* and *že*, e.g. *že* is also an aspectual adverb. Furthermore, discourse particles in Slovenian are optional (as in other languages, see Bayer & Obenauer 2011 for German), to a certain extent various discourse particles can appear simultaneously in the clause, they are sensitive to clause type, they normally do not bear stress, and they are monosyllabic. And perhaps most importantly, discourse particles do not modify the proposition, but rather the utterance (Bayer & Obenauer 2011) as they express speakers' attitude towards the utterance (Zimmermann 2011). To further show properties of particles *že* and *pa* in Slovenian, we discuss them separately in this section.

### **2.1** *Že* **as a discourse particle**

Etymologically, the origin of the Slovenian particle *že* is closely related to the morpheme *-r* that one finds in relative pronouns in Slovenian (*kdor* 'who', *kar* 'what', *kjer* 'where'). Both are etymologically related to the Indo-European particle *\*g<sup>h</sup>e/\*g<sup>h</sup>o* that has developed into particles in several Slavic languages, for example *že* in Russian (see Hagstrom & McCoy 2003 for its interpretation in wh-questions), and *že* in Czech (Gruet-Skrabalova 2012) (cf. Mitrović 2016). But despite the common source, languages differ with respect to the actual meaning of *že*.

For example, Gruet-Skrabalova (2012) shows that the Czech *že* is a complementizer that can be used in declarative and interrogative clauses. In embedded contexts *že* combines with the declarative clause and it marks syntactic dependence of the embedded clause, but *že* also triggers an echo-interpretation in Czech.<sup>7</sup> That is, in wh-questions, following Gruet-Skrabalova (2012), *že* indicates that the speaker has not heard or that (s)he refuses to accept a previous utterance. For example, (6) is used to check whether the part of the utterance asked by the wh-word was asserted in the previous context, see Gruet-Skrabalova (2012) for more on Czech *že*.

<sup>6</sup> For further differences between discourse markers and discourse particles see, for example, Zimmermann (2011).

<sup>7</sup>Czech particle *že* is in many respects similar to Slovenian *da* 'that', which can be used as a complementizer or a discourse particle, see Marušič et al. (2015) for more on this topic.


	- B: Přece do restaurace!
	  indeed to restaurant
	  '(I said he went) to a restaurant.' (Czech; Gruet-Skrabalova 2012: 5, (10))

As McCoy (2003) observes, Russian *že* can function as a modal/affective particle, focus marker, marker of contrastive focus, emphasis marker, thematic/organizational/textual *že*, marker of (re-)activated information, and marker of a reference point in the activated domain of reference. Hagstrom & McCoy (2003) and McCoy (2003) observe the following contexts with distinctive occurrences of *že* in Russian: *yes-no* questions, wh-questions, statements with phrasal scope and statements with sentential/propositional scope. Example (7) shows the use of *že* in a wh-question. Crucially, it depicts a situation where the child wants to sleep in the morning, which seems unreasonable to the mother, as the child is not supposed to have a reason to feel sleepy at that time. That is, example (7) is a rhetorical question, where *že* roughly corresponds to the English *in the world*. As shown in (8), *že* can be used in Slovenian in a similar way.<sup>8</sup>


While (8) already shows one meaning of the particle *že*, *že* most commonly appears as an aspectual adverb meaning 'already', as shown in (9). Using *že* as an aspectual adverb is very common, and while in some cases *že* can receive this interpretation in addition to the discourse particle reading, this is not directly relevant for the present discussion.<sup>9</sup>

<sup>8</sup>As McCoy (2003) observes, in these rhetorical questions the speaker does not expect a possible reasonable/true answer. Moreover, her conclusion is that *že* in wh-questions applies to every member of the set of contextually accessible answers to the question, generating a presupposition for each proposition in the set. Therefore, Russian *že* in wh-questions generates the presupposition that the possible answers from the set in question have already been evaluated as false, and the same applies to Slovenian *že* under the conditions presented in (8).

(9) Peter je že šel na počitnice.
    Peter aux že went on vacation
    'Peter has already left for the vacation.'

While we can also find the aspectual meaning in wh-questions, the use of *že* in sluicing or a wh-question more importantly indicates that the speaker knows the answer to the question but does not remember it. We will refer to this reading as the 'remind-me' reading, following Sauerland & Yatsushiro (2014), and will use *že*-r to refer to the morpheme carrying this meaning. The morpheme carrying the aspectual reading will be referred to as *že*-a. The former reading is apparent in the following scenario. Imagine we visit our friend Peter in April, but his mother tells us he is not home. We remember that he is never home in the spring, and we actually know where he always travels in the spring, but at the moment we cannot recall where. We ask his mother the question in (10) as a 'remind-me' question.

(10) Kam že hodi vsako leto?
     where že goes every year
     '(Remind me) Where does he go every year?'

This meaning is possible in wh-questions and in sluicing, while *že* in *yes/no*-questions (or in declarative sentences), such as (11), can receive the aspectual reading, but not the 'remind-me' reading.

(11) A je že opral obleke?
     q aux že washed clothes
     Available: 'Did he already wash the clothes?'
     Unavailable: '(Remind me) Did he wash the clothes?'

<sup>9</sup>There are also other meanings, for example, *že* can be used to express agreement with a statement:

	- B: Že že, a ne vem, kdo jih je zlikal.
	  že že but not know who it.acc aux ironed
	  'True, but I don't know who ironed them.'


Interestingly, as shown in (12b), both the 'remind-me' interpretation and the aspectual reading of *že* are available when *že* and the wh-word are not adjacent. In relation to this, two things need to be noted. First, the availability of *že*-r in (12b) implies that *kaj* 'what' and *že*-r do not necessarily form a constituent, as clitics do not split syntactic constituents in Slovenian.<sup>10</sup> Second, when *že* precedes the auxiliary clitic, only the 'remind-me' reading is available.<sup>11</sup>

	- b. Kdo je že naslikal Guernico?
	  who aux že painted Guernica.acc
	  Available: '(Remind me) who painted Guernica?'
	  Available: 'Who already painted Guernica?'

In a wh-question, *že*-r follows the wh-phrase. Examples in which *že*-r precedes the wh-word are unacceptable, as wh-phrases need to appear in a clause-initial position in Slovenian wh-questions; see Mišmaš (2016).

(13) \* Že kdo je naslikal Guernico?
     že who aux painted Guernica
     Intended: '(Remind me) who painted Guernica?'

In sluicing, *že* can only receive the 'remind-me' reading, as (14) shows. That is, (14) can be used in a context where the speaker is playing a game, where (s)he needs to name the author of Guernica. The speaker knows the answer, but cannot remember it, so (s)he utters:<sup>12</sup>

<sup>10</sup>Consider for example the ungrammaticality of (i.b):

	- b. \*Poletni je dež prekinil zabavo.

<sup>11</sup>A note on intonation is needed. That is, when (12b) is interpreted as a wh-question with *že*-a, it will also receive a normal wh-intonation. On the other hand, *že*-r in (12b) is emphasized and the question ends with a rising intonation (similar to the intonation in *yes/no*-questions). Interestingly, (12a) does not receive a true wh-reading if we change the intonation and the only interpretation it can receive is the 'remind-me' reading. This implies that the intonation does not trigger the 'remind-me' interpretation of the wh-question.

(14) Seveda vem, kdo je naslikal Guernico. Kdo že?
     of course know who aux painted Guernica.acc who že
     'Of course I know who painted Guernica. (I need to remember) Who?'

Crucially, in (14) *že* cannot be interpreted as an aspectual adverb. Given that aspectual adverbs are located in the IP area and that sluicing is said to delete the entire IP area, the lack of an aspectual reading for *že* is expected. And as *že*-r is available in the structure where the IP is supposedly missing, we have an argument for assuming that *že*-r originates inside the left periphery. We return to this question below in §4.

### **2.2** *Pa* **as a discourse particle**

Following Snoj (2009), *pa* (which has counterparts in several Slavic languages, for example in Serbo-Croatian as *pa* and *pak,* meaning 'again' or 'then', and Czech *pak* 'then, after') is related to *paky* 'again', 'also' in Old Church Slavonic and originates from Proto-Slavic *\*pȃkъ*, which originally meant 'differently', 'again', 'later', and probably also 'wrong' and 'bad'; see Snoj (2009) for more information on the etymology of *pa*.

Today, *pa* is a very common element in Slovenian, especially in colloquial language. The particle *pa* can be used in regular coordinations (similarly to standard Slovenian *in* 'and'), (15), and as a subordination complementizer such as the standard Slovenian *ampak* 'but'. In the latter use, *pa* typically appears in the second position (see Marušič et al. 2011 for more data), as can be seen from the examples in (16).

(15) Peter pa Ana plešeta.
     Peter and Ana dance
     'Peter and Ana are dancing.'

<sup>12</sup>In sluicing, *že* can be used in rhetorical questions, already discussed above. That is, (i) can be used in the situation described in (8).

(i) A pomagam ti naj? Zakaj že?
    q help.1sg you.dat should why že
    'Oh, I should help you? Why?'


	- b. Peter pleše, ampak ne poje.
	  Peter dances but not sings
	  'Peter dances, but does not sing.'

The particle *pa* can function as a topic marker or as a contrastive focus marker in declarative sentences. *Pa* used as a topic marker is given in (17). In the context where friends are talking about various people dancing and someone asks about a certain person called 'Peter', (17) could be a natural reply. *Pa* can also be a contrastive focus marker, as in (18).


*Pa* can be a topic/focus marker in wh-questions as well. In this role, *pa* interacts with an emphasized constituent. Depending on the emphasis (marked with small caps), the meaning of the question in (19) varies slightly. However, we are focusing here on *pa* as a discourse marker, so we leave these cases aside.

	- b. Kdo pleše pa s Petrom?
	  who dances pa with Peter.ins
	  '(We know about who dances with the others, but we want to know) who dances with Peter?'

As a discourse marker, *pa* is associated with a strongly presupposed context (see Cheng & Rooryck 2000 for this interpretation of wh-in situ questions in French). That is, the situation is established and/or is presupposed and we are seeking details about the situation. Hence, just like what Cheng & Rooryck (2000) claim for French, a negative answer to a wh-question with the discourse particle *pa* is unexpected. For example, if we ask (20a), we already know that someone was visiting; we just do not know who was visiting. Getting a negative answer ('Nobody.') is not impossible, but it would be surprising for the speaker to get this answer. As a side note, (20b) shows that *pa* can appear before or after the auxiliary clitic, just like *že*, which again indicates that the particle and the wh-phrase do not form a constituent.

	- b. Kdo je pa bil to?
	  who aux pa was this

This reading, related to the strongly presupposed context, is also available in sluicing.<sup>13</sup> So, if we hear (21a) and we reply with the sluices in (21b) or (21c), this means that we potentially already knew (21a) or we fully accept (21a), but we need additional information about what and when Ana was eating.

	- c. Kdaj pa je pojedla?
	  when pa aux ate
	  'When?'

Examples in this section show that discourse particles can appear in sluicing in Slovenian, but more importantly, they indicate that not only operator material survives sluicing, contrary to what we would expect given the Sluicing-COMP generalization (Merchant 2001). The question is then why discourse particles in Slovenian are able to do so.

<sup>13</sup>*Pa* in sluicing can also be a contrastive focus particle – for example (21b) can also be interpreted as a response to a context in which we already know what Ana did not eat but we want to know what she did eat (cf. Marušič et al. 2015). While interesting, we are leaving this reading aside here.


### **3 Wh-phrases, discourse particles and… what else?**

While we have only considered particles thus far, we also need to consider instances of so-called contrast sluicing, i.e. cases "where the correlate is a focused definite expression, rather than an indefinite" (Vicente 2018: 12). We can find contrast sluicing in English as well (Merchant 2001: 36):

	- b. We already know which streets are being repaved, but not which avenues.

(Merchant 2001: 36, (81a,d))

Cases just like these exist in Slovenian, too, and in Slovenian, just as in English, the wh-phrase and the "contrast" can form a complex wh-phrase:

	- b. Vemo, katere ulice bodo ponovno tlakovane, a ne, katere avenije.
	  know.1pl which streets aux again paved but not which avenues
	  'We know which streets are being repaved, but not which avenues.'

However, the availability of complex wh-phrases in sluicing in Slovenian does not account for instances of discourse particles in sluicing, as already observed in Marušič et al. (2015). That is, based on the observations that discourse particles in Slovenian (i) can be separated from the wh-word by parentheticals, shown below for *pa* in a wh-question and a sluice, (24) and (25a), respectively, (ii) can appear after the auxiliary clitic, cf. examples (12b) and (20b), which in Slovenian does not break syntactic constituents, and (iii) cannot appear with unmoved wh-phrases, Marušič et al. (2015) conclude that in Slovenian, discourse particles do not form a constituent with wh-phrases.



b. Kaj, po tvojem mnenju, pa je pojedla?
   what after your opinion pa aux ate
   'What, in your opinion, did she eat?'

In fact, the same conclusion can be drawn from examples showing that the same particle cannot appear after all wh-phrases in multiple sluicing in Slovenian. That is, while multiple sluicing by itself is acceptable in Slovenian (a multiple wh-fronting language) and while particles can only marginally appear after each of the wh-phrases in multiple sluicing, these have to be different particles (we are not discussing the particle *to* here, but see Marušič et al. 2015); compare (27b) with (27e). Imagine a context in which you lend your glasses to a friend who had a party and the next day, the friend comes by to explain the situation and you demand to know:

(26) Na zabavi je nekdo nekomu metal kozarce in jih razbil.
     on party aux somebody.nom somebody.dat throw glasses and them broke
     'At the party, somebody threw glasses at somebody and broke them.'

	- b. \* Kdo pa komu pa?
	  who.nom pa who.dat pa
	- c. Kdo komu pa?
	  who.nom who.dat pa
	  '(I want to know) Who (threw the glasses) to whom?'
	- d. Kdo pa komu?
	  who.nom pa who.dat
	  '(I want to know) Who (threw the glasses) to whom?'
	- e. ? Kdo pa komu to?
	  who.nom pa who.dat to
	  '(I want to know) Who (threw the glasses) to whom?'

If particles formed a constituent with each individual wh-phrase prior to movement, we would expect (27b) to be just as acceptable as (28b), in which the sluice consists of two complex wh-phrases that only differ in their case features. But as shown, this is not the case.


	- b. Kateri slikar katerega slikarja?
	  which painter.nom which painter.acc
	  'Which painter which painter?'

This can then be taken as an additional argument against particles forming a constituent with the wh-phrase and shows that instances of sluicing with discourse particles are not simply parallel to cases in which a complex wh-phrase survives sluicing. But, crucially, this shows that discourse particles in wh-questions in Slovenian are not located within the wh-phrase.

Furthermore, in Slovenian 'contrast' sluices are not necessarily complex wh-phrases, but rather consist of a wh-phrase (simplex or complex) and a non-wh-phrase. Even more, this non-wh-phrase can be discourse given, (29), or new, (30).

	- b. In s kom Črt?
	  and with who Črt.nom
	  'And Črt was with whom?'
	- b. Ne, kje Kekec Mojco.
	  no where Kekec.nom Mojca.acc
	  'No, (s)he can't remember where Kekec met Mojca.'

Based on similar examples, Marušič et al. (2015) suggest that in sluicing in Slovenian, the non-wh-material in the left periphery is not elided but we can in turn take it as an indicator that the particles do not have to form a constituent with the wh-phrase in sluicing examples. In the next section, we maintain the analysis from Marušič et al. (2015) and focus on the position of discourse particles in Slovenian wh-questions and in doing so give new arguments for the proposed analysis.


### **4 Position of particles**

While particles are well studied in some languages, for example in German, particles in wh-questions have not been previously studied in Slovenian (at least not within the generative framework). In what follows we focus on the position of particles *že* and *pa* in Slovenian wh-questions. Focusing on examples with sluicing we show that the particles are not a part of the clitic cluster in Slovenian, despite their lack of stress and what at first glance seems to be simply a clause second position. Furthermore, we take instances of particles in sluicing as evidence that these particles are not a part of the IP.

### **4.1 Discourse particles are not part of the clitic cluster**

Traditionally, the discourse particles *pa* and *že* are said to be part of the clitic cluster in Slovenian. Specifically, Toporišič (2000) places them as the last clitics of the clitic cluster. Similarly, Orešnik (1985) suggests that at least one variety of the particle *pa* should be seen as part of the clitic cluster. Toporišič (2000) does not make any distinction between various types of particles *pa* and *že*; he considers all of them comparable to the negation clitic *ne* and other particles like *še* 'more'/'still', *da* 'that'/'yes', etc. If particles are part of the clitic cluster and if the clitic cluster is a conglomeration of syntactic heads that is adjoined to the C head (as in Golden & Sheppard 2000), we would expect, contrary to fact, that particles would behave like clitics and should thus, just like other clitics within the same cluster, not be possible in sluicing, as shown in (31).

(31) Ilija mu ga nekje razlaga. Kje že (\*mu ga)?
     Ilija him it somewhere explains where že (him it)
     'Ilija is explaining it to him. (Remind me) Where (is Ilija explaining it to him)?'

Given the assumptions explained above and the example (31) we cannot but conclude that the particles that we observe in sluicing in Slovenian must be DP-internal, while the particles that we observe in wh-questions originate from a position inside the IP, as the complementizer is the first clitic inside the clitic cluster. This goes against the findings of Marušič et al. (2015) and our own conclusions about the nature of these particles in sluicing and wh-questions. Our goal now is thus to show that the "cluster-final" particles are not truly part of the clitic cluster and that additionally, the (mainstream) assumptions about clitic placement explained above also need to be (at least partially) revised or discarded.


First, as claimed by Marušič (2008), clitics forming the clitic cluster are not adjoined to C as they can easily appear following a word that should be located lower in the clause (cf. Bošković 2001 for BCS clitics). Orešnik (1985) gives another argument against placing the clitic cluster in the C head. As he puts it, the complementizer should not be seen as a part of the clitic cluster as focused phrases can split the complementizer from the rest of the clitic cluster, as in (32b) taken from Orešnik (1985).

	- b. … {in / ker / da} Janez si ga lahko kupi.
	  {and / as / that} Janez refl it can buy
	  '… {and / as / that} Janez can buy it.'

If clitics move in overt syntax, then the clitic cluster that is apparently not adjoined to C needs to be hosted by a lower head – a head within IP. So for the particle at the end of the clitic cluster that would mean its place of origin should also be somewhere inside the IP, which further suggests our analysis is simply wrong. We can dismiss this argument by saying that Slovenian clitics do not move in syntax (as suggested by Marušič 2008 and Marušič & Žaucer 2017) or that at least the clitic cluster is not composed in syntax, for which there also seems to be evidence given that the order of clitics inside the cluster is not universal and does not follow any order predicted by the assumed structure (cf. Marušič 2016), but let us try and argue against the cluster-internal position of the discourse particles also within the mainstream view on clitics.

As noted above, the two particles *že* and *pa* can actually appear either before or after the clitic cluster, as shown in (20b) for *pa* and in (12b) for *že*, and in (33) for both. Given that all other clitics forming the clitic cluster have a fixed word order (with some variation in the order of dative and accusative clitics), we can conclude that the two clitics are not part of the clitic cluster but appear either cluster-initially or cluster-finally by accident.

	- b. Kaj {pa mu je / mu je pa} Žodor narisal?
	  what {pa him aux / him aux pa} Žodor drew
	  'What did Žodor draw for him?'


Another argument given above to show these particles do not form a constituent with the wh-word can be turned around. As shown in (24) repeated here as (34), *pa* can follow the parenthetical 'in your opinion', but note that *pa* can also precede the parenthetical and appear on the other side of the parenthetical separated from the rest of the clitic cluster, (35). This suggests *pa* is an element independent from the clitic cluster that is located structurally higher than the final position of the clitic cluster.


Further, in some cases, *pa* and *že* can appear also inside the complex wh-phrase as in (36) and (37). Note that these examples do not constitute an argument for a wh-phrase-internal position of these discourse particles, as argued by Marušič et al. (2015), but they do suggest that these discourse particles are different syntactic elements from the clitics forming the clitic cluster.


(Marušič et al. 2015: (38))

And finally, clitics in Slovenian typically follow the first wh-phrase of a multiple wh-question, (38), while discourse particles can follow the first or second wh-phrase in a multiple wh-question with two wh-phrases, as examples in (27) show.

(38) Kdo {jih je komu / \*komu jih je} metal?
     who.nom {them aux who.dat / who.dat them aux} threw
     'Who threw them to whom?'

Given all that, regardless of our assumptions about clitics and the way the clitic cluster is formed, discourse particles are syntactic elements that behave differently from clitics, so that we have no argument to posit they originate from the same region of the clause or that their surface position is in any way dependent on the surface position of the other clitics. Discourse particles and clitics behave differently in wh-questions, thus it is not unexpected that they behave differently also in sluicing.<sup>14</sup>

### **4.2 Position of particles with respect to adverbs**

An argument for the analysis that places discourse particles in the left periphery of a wh-question (and a sluice) comes from the behavior of adverbs, specifically from the incompatibility of high sentential adverbs and sluicing in Slovenian. There are several suggestions with respect to the position of discourse particles. Zimmermann (2011) proposes that, perhaps universally, discourse particles tend to be realized in the periphery of the clause, but that some languages, such as German, should be exempt from this (i.e. in German discourse particles do not occur in the periphery but rather in the middlefield because they do not bear stress and unstressed elements cannot appear in the prefield in German).<sup>15</sup> Facts from sluicing in Slovenian in fact suggest that discourse particles do appear higher than high sentential adverbs.

Specifically, high sentential adverbs in Cinque's (1999) hierarchy of adverbs express the speaker's attitude and are in this respect similar to discourse particles, which express the speaker's attitude towards the utterance (Zimmermann 2011). However, while particles can appear in sluicing in Slovenian, high sentential adverbs cannot. This is shown below for the adverb *menda* 'allegedly' (but the same is true for *baje* in non-standard varieties of Slovenian) – a relatively high adverb that is compatible with wh-questions (that is, while seemingly higher adverbs such as *iskreno* 'frankly' can appear in wh-questions, they only receive a subject-oriented reading).

<sup>14</sup>An anonymous reviewer suggested our data are fully compatible with a view where the only relevant criterion for clitic cluster formation is PF adjacency. If we further assume pronominal and auxiliary clitics are IP clitics whereas discourse particles are CP clitics (as they are located in the left periphery – in the CP area), then IP clitics and CP clitics would have been adjacent at PF in the absence of sluicing, but they would have never been syntactically adjacent or part of the same complex head. And when sluicing would elide the IP, IP clitics would get deleted whereas CP clitics would survive.

<sup>15</sup>Ott & Struckmeier (2016), assuming that particles in German are located outside the vP, above sentential adverbs and negation, argue for a phonological approach to ellipsis in which material that is in the background is elided. This approach does not necessarily require movement. Crucially, Ott & Struckmeier (2016) show that in German sentential adverbs can appear in clausal ellipsis, contrary to Slovenian. This implies that while cases of sluicing with particles in Slovenian and German seem similar at first glance, the two are in fact different.

(i) *Context:* 'Peter seems to have invited some people.'
    Und wen {vermutlich / wahrscheinlich / anscheinend}?
    and who {presumably / probably / apparently}
    'And who did he {presumably / probably / apparently} invite?'

(Ott & Struckmeier 2016: (15b))

	- b. Kdo je že menda plesal tango?
	  who aux že allegedly danced tango
	  Available: '(Remind me) Who allegedly danced tango?'
	  Available: 'Who allegedly already danced tango?'
	- c. Kdo je menda že plesal tango?
	  who aux allegedly že danced tango
	  'Who allegedly already danced tango?'
	- a. Kdo že?
	  who že
	  '(Remind me) Who?'
	- b. \* Kdo menda?
	  who allegedly
	  Intended: 'Who, allegedly?'

First, the examples in (40) indicate that discourse particles precede high sentential adverbs in wh-questions in Slovenian, since *že* only gets the aspectual reading when it follows an adverb such as *menda* 'allegedly'. More importantly, high sentential adverbs cannot appear in sluices in Slovenian, indicating that the material in the IP is elided.<sup>16</sup> And since particles can appear in sluicing, this suggests that discourse particles in wh-questions in Slovenian are located above the IP.
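The reasoning in this paragraph can be illustrated with a small sketch. It is a toy model only and not part of the analysis: the `Word` tuple, the region labels `"LP"` and `"IP"`, and the functions `sluice` and `question` are hypothetical names introduced here, and the placement of each word simply encodes the assumptions made in the text (wh-phrases and *že*-r in the left periphery; clitics, *menda* and *že*-a inside the IP). Linear order and clitic placement are not modelled.

```python
from typing import NamedTuple

class Word(NamedTuple):
    form: str
    region: str  # "LP" (left periphery) or "IP"; hypothetical labels

def sluice(clause):
    """Model sluicing as deletion of every IP-internal element."""
    return [w.form for w in clause if w.region == "LP"]

def question(ze_region):
    """(40b) 'Kdo je že menda plesal tango?' with že assigned to a region."""
    return [
        Word("kdo", "LP"),      # wh-phrase
        Word("je", "IP"),       # auxiliary clitic
        Word("že", ze_region),  # že-r if "LP", že-a if "IP"
        Word("menda", "IP"),    # high sentential adverb
        Word("plesal", "IP"),
        Word("tango", "IP"),
    ]

print(sluice(question("LP")))  # 'remind-me' že-r survives the sluice
print(sluice(question("IP")))  # aspectual že-a is deleted with the IP
```

Under these assumptions the sketch returns the sluice in (41a) when *že* is left-peripheral, and never returns *menda*, in line with the ungrammaticality of (41b).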

<sup>16</sup>The apparent exception are contrastively focused adverbs as example (i) shows:

	- B: In kdo menda?
	  and who allegedly
	  'And who (danced) allegedly?'


### **5 Conclusion**

Discourse particles in wh-questions have not previously been studied in Slovenian within the generative framework. In this paper we take instances of sluicing in which discourse particles *pa* and *že* appear as a starting point to explore discourse particles in wh-questions (and consequently sluicing) in Slovenian. We consider cases with *že* and *pa* in wh-questions and sluicing to show that discourse particles in Slovenian are not in complex wh-phrases nor are they a part of the clitic cluster or the IP. In fact, all of the properties we explore in this paper can be captured under the analysis proposed in Marušič et al. (2015), i.e. an analysis according to which discourse particles are located in the left periphery. Under this approach the projections hosting wh-phrases are not the only projections surviving sluicing in Slovenian, but rather what survives sluicing is a larger portion of the left periphery, hence also the grammaticality of topic and focus phrases in sluicing in Slovenian.

A natural question that follows (also pointed out by one of the anonymous reviewers) is why particles can survive IP-deletion in the left periphery, while auxiliaries like *did* and *do,* which end up in the left periphery following T-to-C movement, do not. The elements that we observe survive sluicing in the left periphery all originate from within the left periphery, while English auxiliaries do not; they are moved to the left periphery via T-to-C movement. One option to resolve this question is to simply state that the deletion of the IP in sluicing precedes T-to-C movement, as a result of which the auxiliaries never even reach the C head, where they could survive sluicing. As T-to-C movement is an instance of head-movement and as head-movement is occasionally argued to be an instance of PF movement, it actually follows quite naturally that elements like *did* cannot survive sluicing, as they do not occupy a left-peripheral position at the time when the IP is deleted.
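The ordering argument can be made explicit with a toy derivation. This is a sketch under simplifying assumptions, not an implementation of any particular theory of ellipsis: a clause is reduced to two word lists, and the names `delete_ip`, `t_to_c` and `derive` are hypothetical. The only point illustrated is that the order of IP-deletion and T-to-C movement decides whether the auxiliary surfaces.

```python
def delete_ip(clause):
    """Sluicing: remove everything that is inside the IP at this point."""
    return {"LP": list(clause["LP"]), "IP": []}

def t_to_c(clause, aux="did"):
    """Head movement: raise the auxiliary to the left periphery if it is still in the IP."""
    if aux not in clause["IP"]:
        return clause
    remaining = [w for w in clause["IP"] if w != aux]
    return {"LP": clause["LP"] + [aux], "IP": remaining}

def derive(clause, operations):
    """Apply the operations in order and return the pronounced string."""
    for op in operations:
        clause = op(clause)
    return clause["LP"] + clause["IP"]

# English: the auxiliary starts out inside the IP (assumed base position).
english = {"LP": ["who"], "IP": ["did", "she", "see"]}
# Slovenian: the particle is assumed to originate in the left periphery.
slovenian = {"LP": ["kdo", "že"], "IP": ["je", "naslikal", "Guernico"]}

print(derive(english, [delete_ip, t_to_c]))  # deletion first: no auxiliary
print(derive(english, [t_to_c, delete_ip]))  # counterfactual order: auxiliary survives
print(derive(slovenian, [delete_ip]))        # particle survives regardless
```

With deletion ordered before movement, the English sluice contains only the wh-phrase, whereas the Slovenian particle survives because it never depends on movement out of the IP.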

### **Abbreviations**



### **Acknowledgements**

We are grateful to the editors and two anonymous reviewers of this volume for comments and suggestions. We acknowledge the financial support of ARRS Program P6-0382 (PI: Marušič).

### **References**



*to Slavic Linguistics 16: The Stony Brook Meeting 2007*, 266–281. Ann Arbor, MI: Michigan Slavic Publications.




### **Chapter 10**

## **The markedness of coincidence in Russian**

### Emilia Melara

University of Toronto

This paper presents a novel analysis of the Russian Infl domain. Specifically, it is argued in this paper that in Russian, the past tense, as opposed to the non-past, is the default, unmarked tense. Consequently, non-past in Russian is marked by the specification of a privative feature on T<sup>0</sup>, which associates the event/state expressed by vP to some anchoring time. This analysis stems from observations of how subjunctive matrix and complement clauses are interpreted. The analysis captures how, unlike other languages with the subjunctive mood, Russian allows main independent clauses to appear in the subjunctive. It additionally furthers work on features and properties of the Infl domain, showing how languages use different features, from what appears to be a limited set, to express time and realis contrasts.

**Keywords:** Russian, tense, subjunctive, Infl, realis and irrealis moods

### **1 Introduction**

This study examines the morphosyntactic features of the Russian inflectional domain by focusing on the selectional properties of the Russian subjunctive. Traditionally, the subjunctive is held to be a mood (whether or not there is overt morphology) that expresses an eventuality as hypothetical, advisable, desirable, or obligatory with respect to the sentential subject (Harrison & Le Fleming 2000: 142). In Russian, the subjunctive mood is expressed with the particle *by* and typically with the past-tense form of the predicate.

(1) Ty uš-**l**-a **by** domoj.
    you leave-pst-sg.f by home
    'You would {go / have gone} home.' (Mezhevich 2006: 152)


Despite co-occurring almost exclusively with the past-tense verb form, however, constructions containing *by* show no semantic tense contrasts whatsoever (Spencer 2001: 298). This is illustrated in (2), where past, present, and future-oriented temporal adverbs are shown to licitly co-occur with the past tense verb form when *by* is present.

(2) Ja **by** uexa-**l**-a {včera / sejčas / zavtra}.
    I by leave-pst-sg.f {yesterday / now / tomorrow}
    'I would {have left yesterday / leave now / leave tomorrow}.' (Mezhevich 2006: 136)

*By* can also co-occur with the infinitive form of the verb in an independent matrix clause.

(3) Oj s"es**t'** **by** Pete {včera / zavtra} jabloko!
    oh eat.inf by Peter.dat {yesterday / tomorrow} apple
    'If only Peter would eat an/the apple tomorrow!' or 'If only Peter would have eaten an/the apple yesterday!' (Asarina 2006: 10)

Non-past finite forms of the predicate, on the other hand, are completely illicit with *by*.

	- b. \* Ja ujd-**u** **by** domoj.
	  I leave.pfv/fut-1sg by home
	  Intended: 'I would go home.' (adapted from Mezhevich 2006: 133)

This study stems from these observations. It asks: What can these co-occurrence patterns tell us about the interpretable features of the Russian inflectional system? I argue that *by* is the phonological spell-out of an irrealis head in the Russian inflectional domain, whose projection is semantically incompatible with the specification of any feature that situates a clause at the utterance context. Specifically, I will claim that this feature is [Coin(cidence)] (cf. Ritter & Wiltschko 2005, Ritter & Wiltschko 2009), which is hosted in T. A consequence, and perhaps the main take-away, of this proposal is that the contrast between past and non-past in Russian is distinguished by the specification of [Coin], past tense being the unmarked tense. This proposal is rooted in Distributed Morphology (Halle & Marantz 1993; Embick & Noyer 2007) and builds on the feature geometry work of Cowper (2002; 2005) and others.

The outline of this paper is as follows. In §2, I describe the data considered for the analysis to be presented, covering the tense system in Russian along with how the subjunctive is expressed in the language. §3 provides a background sketch of the subjunctive mood cross-linguistically and in the literature. In §4, I present an analysis of the data presented in §2. §5 expands the analysis presented to account for Russian subjunctive constructions as complement clauses. Finally, I conclude in §6.

### **2 The Russian system**

In Russian, most verbs come in aspectual pairs (Mezhevich 2008: 371) – an imperfective form and corresponding perfective form – and tense is often defined with respect to aspect (Mezhevich 2008: 373). In the indicative mood (that of "independent main assertive clause type[s]" (Wiltschko 2017: 1)), imperfective aspect allows for temporal distinctions among past, present, and a periphrastic future; perfective only allows for past and future readings (Mezhevich 2008: 371). Among non-past forms, aspect plays a role in distinguishing present from future. The examples in (5) and (6) show the temporal-aspectual realizations for the verb 'fall', illustrating the Russian tense system.


Unlike Modern Russian, Old Russian made a distinction among four past tenses, namely, the aorist, the perfect, the pluperfect, and the imperfect (Mezhevich 2006: 38). Perfect and pluperfect constructions contained an inflected form of *byti* 'be' and a form commonly referred to as the l-participle: a verb containing the *-l* suffix. The distinction among the four past tenses was lost over time. What has remained is the *-l* suffix as the sole marker of past tense (ibid.).

Although historically it was the case that the *-l* suffix of the l-participle did not mark past tense itself, it has been argued that the suffix has been reanalyzed as the past tense morpheme in Modern Russian (see Mezhevich 2006 for a discussion and references). The form's distribution and interpretation in Modern Russian contrast with what are considered to be non-past predicate forms. I therefore treat the *-l* suffix that attaches to verbs as the past tense form here. In no way, however, do I assume that it exclusively expresses past tense. As shown in (2) and to be seen in later examples, when *-l* co-occurs with *by*, one interpretation the clause may receive is a past interpretation but in no way is such a construction restricted to that interpretation. A clause containing both these morphemes may also receive non-past readings.

Apart from the indicative, Modern Russian has only two formal moods: the imperative and the subjunctive/conditional (Cubberley 2002: 157). Russian does not have specific subjunctive verb forms (Mezhevich 2006: 118). Rather, subjunctive clauses are generally formed with the particle *by* and the l-participle, as in (7), repeated from (1), and (8).


(8) Liza xote-l-a, [čtoby Philemon uše-l].
    Liza want-pst-sg.f čtoby Philemon leave-pst.sg.m
    'Liza wanted Philemon to leave.' (Mezhevich 2006: 148)

Traditionally, the subjunctive is held to be a mood (whether or not there is overt morphology) that expresses an eventuality as hypothetical, advisable, desirable, or obligatory (Harrison & Le Fleming 2000: 142), as in (9), with respect to the sentential subject.


In Russian, the subjunctive pattern described above is used to express these semantic notions, for example, in (10) and (11).



'I would very much like to go to the theatre tomorrow.'

That is, in (10), the subjunctive is used to express advisability with respect to the subject and in (11), desirability. (10a) and (10b) illustrate that the imperfective-perfective distinction is maintained in the subjunctive mood.

Although *by* derives from the aorist of the Old Russian auxiliary *byti* 'be', it has been reanalyzed as a marker of the subjunctive/conditional separate from the Modern Russian form *byt'* 'be'. The main distinguishing property between *by* and *byt'* is that the latter has a paradigm of inflected forms while the former does not; rather, it is a frozen morpheme (see Spencer 2001; Mezhevich 2006).

In matrix clauses, *by* most naturally appears following the main verb (Cubberley 2002: 200). However, it can also follow a focused element, appearing in the second sentential position (Spencer 2001: 298), as in (12). In theory, though, *by* can occur in any position except clause-initially (Hacking 1998, cited in Mezhevich 2006: 152; Spencer 2001: 298); see (13).



It was noted in §1 that *by* cannot co-occur with a non-past-tense predicate. This is shown again in (14), repeated from (4).


Embedded under predicates that license subjunctive clauses, *by* surfaces clause-initially with the indicative complementizer *čto* as a fused form (Brecht 1977).


Like *by* in matrix clauses, *čtoby* never appears with present or future morphology on the predicate.

	- b. \* Maša xočet čtoby Petja {est / s"est} jabloko.
	  Maša wants čtoby Peter {eat.ipfv.prs / eat.pfv.prs(=fut)} apple
	  Intended: 'Mary wants for Peter to eat an apple.' (Asarina 2006: 7)

Unlike matrix subjunctive clauses, a subjunctive complement clause does not allow a past-tense reading, as shown in (17c), while present and future interpretations are possible, as shown in (17a) and (17b).

	- b. Ja xoču, čtoby Maša sejčas e-l-a jabloko.
	  I want čtoby Mary now ate-pst-sg.f apple
	  'I want for Mary to be eating an apple right now.'
	- c. \* Ja xoču, čtoby Maša včera s"e-l-a jabloko.
	  I want čtoby Mary yesterday eat-pst-sg.f apple
	  Intended: 'I want for Mary to have been eating an apple yesterday.'

(Asarina 2006: 8)


In the case that the subjects of the complement and matrix clauses are coreferential, however, the subordinate predicate appears in its infinitival form (Cubberley 2002: 160, 236), as shown in (18). When the subjects of the complement and matrix clauses have disjoint reference, the subordinate clause appears with the complementizer *čtoby* and the past tense form of the embedded verb, as in (19). The disjoint reference requirement for the subject of the embedded subjunctive clause with respect to the subject of the matrix clause is called "subject obviation" (cf. Antonenko 2010: 1).

	- b. My xote-l-i ėto sdelat' zavtra.
	  we want-pst-pl this do.inf tomorrow
	  'We wanted to do that tomorrow.' (Harrison & Le Fleming 2000: 143)
	- b. My xote-l-i čtoby vy ėto sdela-l-i zavtra.
	  we want-pst-pl čtoby you this do-pst-pl tomorrow
	  'We wanted you to do this tomorrow.'

Matrix subjunctives, though, do not differ semantically regardless of whether the predicate appears with past morphology or in the infinitive (Asarina 2006: 10). Note, however, that the subject of the clause appears in its nominative form when the verb appears with *-l* but in its dative form when the verb is infinitival.

	- 'If only Peter would eat an/the apple tomorrow!' or 'If only Peter would have eaten an/the apple yesterday!' (Asarina 2006: 10)

The following section outlines properties of the subjunctive mood from a cross-linguistic perspective.


### **3 The subjunctive mood**

The subjunctive mood contrasts minimally with the indicative (Quer 2006: 660; Wiltschko 2017: 218). However, neither cross- nor intra-linguistically does the subjunctive mood constitute a uniform category (Quer 2006: 661). Some subjunctive-related phenomena are present in some languages but absent in others that have the mood (ibid.). For example, Icelandic subjunctive clauses allow long-distance anaphors while Upper Austrian German subjunctive clauses do not (ibid.). Further, within a single language that has the subjunctive mood, there are subjunctive-related phenomena that are evident in some subjunctive clauses but not all (ibid.).

The subjunctive has frequently been considered a defective tense (e.g. Picallo 1984 and Giannakidou 2009) or at least impoverished semantically with respect to the indicative (see Cowper 2002; Cowper 2005; Schlenker 2003). As a completely defective tense, the subjunctive is claimed to be dependent on some higher structure for its temporal interpretation (Wiltschko 2017: 2). Proposals of this sort stem from the fact that in some languages (e.g. Spanish and Catalan), subjunctives cannot be used in matrix clauses; in these same languages, where the subjunctive appears in a complement clause, the time of the embedded clause is interpreted relative to that of the matrix clause (Wiltschko 2017).

A problem that has been noted concerning the idea that the subjunctive is a defective tense/impoverished morphosyntactically is that there are languages that have been argued to lack tense but have an active indicative-subjunctive distinction (Wiltschko 2017). For example, Wiltschko (2017) demonstrates that in Upper Austrian German, there is no dedicated form for the simple past tense and the bare verb in the indicative is compatible with a past, present, or future interpretation.



Wiltschko (2017) argues that in Upper Austrian German there is a subjunctive–indicative contrast active where a tensed language, for example Standard German, would employ the past–non-past distinction. For example, as shown in (22), subjunctive morphology appears on the verb, closer to the stem than agreement marking.

	- b. Nua es kumm-**at-**ts.
	  only you.pl come-sbj-2pl
	  'Only you guys would come.' (Wiltschko 2017: 17)

Wiltschko claims that the subjunctive-indicative contrast is how the language anchors its clauses. This is evident from the fact that the subjunctive may be used in main independent clauses in Upper Austrian German, and therefore: a) subjunctive clauses are temporally independent, and b) the subjunctive does not create a transparent clause. The proposal, following Ritter & Wiltschko (2005; 2009), is that Infl, the locus of clausal anchoring, contains a [Coin(cidence)] feature which establishes a relation of either overlap or coincidence between Infl's two arguments (in the case of [+Coin]) or disjointness (as in the case of [−Coin]). It is the substantive (a.k.a. semantic) content of the morphology that determines the relation between Infl arguments, for example, time. In the case of Upper Austrian German, subjunctive marking values the [*u*Coin] feature in Infl as [−Coin], while indicative marking values it as [+Coin].

The negatively valued [Coin] feature of Ritter & Wiltschko (2005; 2014) roughly corresponds to Iatridou's (2000) exclusion feature: ExclF. ExclF can range over times or worlds and has the basic meaning presented in (23).

(23) ExclF: <sup>T</sup>(*x*) excludes <sup>C</sup>(*x*),

where <sup>T</sup>(*x*) means TOPIC(*x*) ("the *x* that we are talking about") and <sup>C</sup>(*x*) means CONTEXT(*x*) ("that *x* that for all we know is the *x* of the speaker")


(Iatridou 2000: 246)


Essentially, ExclF and the negatively valued [Coin] feature share the property of establishing that two elements are disjoint.

The analysis to be presented in this paper adopts the feature proposed by Ritter & Wiltschko (2005; 2009), but as a privative interpretable feature of Infl. It also employs Cowper's (2002; 2005) feature geometry of interpretable Infl features. It will also be explained how ExclF, bearing basically the opposite semantics of [Coin], would be less parsimonious in accounting for the behaviour exhibited by the Russian subjunctive. To give away the punch-line, what surfaces is the claim that in Russian, the past tense is morphosyntactically unmarked (non-past being the marked tense) and the Russian subjunctive involves the spell-out of an irrealis head in Infl that is incompatible with the morphosyntactic specification of [Coin].

### **4 The proposal**

I argue in this section that *by* is an irrealis particle that spells out the head of a functional projection IrrP, which merges with TP in a fully articulated Infl structure. Despite proposing IrrP as a modified version of Cowper's (2010) MP, I make no claims here about modal operators in Russian subjunctive clauses or subjunctive clauses in general.

### **4.1 Theoretical framework**

The analysis to be presented adopts the inflectional system proposed by Cowper (2010), based on the feature geometry of the inflectional domain proposed in Cowper (2005). Her framework and the one presented here are rooted in Distributed Morphology (DM) (Halle & Marantz 1993; Embick & Noyer 2007; Bobaljik 2017), a theoretical approach according to which the syntax operates on feature bundles (i.e. lexical items or LIs) taken from the lexicon, combined in terminal nodes. Vocabulary items (or VIs) spell these features out at the phonological interface.

The interpretable, privative features of the Infl domain proposed by Cowper (2005) are divided according to mood, narrow tense, and viewpoint aspect, as shown in (24), where *α* and *β* are features in a dependency structure: in *α* > *β*, *β* is a dependent of *α*.



Figure 1: English Infl domain (Cowper 2010: 2)

The proposed dependency structure from Cowper (2010) for the English Infl domain is provided in Figure 1.<sup>1</sup>

The specification of [Proposition] distinguishes propositions from bare events or states. [Finite] is a syntactic feature that licenses nominative case and verbal agreement. [Deixis] anchors a clause to the moment of speech. [Modality] carries the semantics of necessity or possibility. [Precedence] encodes the meaning of past versus non-past. [Event] encodes the eventive (as opposed to stative) property of a predicate. Finally, the specification of [Interval] derives imperfectivity versus perfectivity. These features are realized on multiple functional heads which together constitute the inflectional domain of the clause.

Under Cowper's proposal, English modals merge in M(od) and subsequently move to T. TP, accordingly, is the projection of the feature [Proposition] given that only in propositions may the past/non-past distinction be realized. The viewpoint aspect features are realized in EP, which is not projected in stative clauses (Cowper 2010: 2). Moreover, the EPP is a property of the domain as a whole and is instantiated by the highest Infl head projected.

<sup>1</sup>While Cowper (2010) proposes heads higher than TP, only the projections relevant to the present proposal are provided here.

I assume here the TP, MP, and EP projections from Cowper (2010) along with the features [Finite], [Modality], and [Event]. [Modality] in my proposal is semantically impoverished in relation to its original proposal: (i) to avoid making any claims about subjunctivity and some relation with modality and (ii) because the semantics of *by* allows for modal interpretations within a superset of additional irrealis readings. I therefore refer to it simply as IrrP, projected by the instantiation of [Irrealis]. Another difference between the feature geometry proposed here and Cowper's is that I follow Ramchand & Svenonius (2014), assuming that propositional content is encoded higher in the clause, namely in the CP domain, rather than within Infl. For Ramchand & Svenonius, clauses consist of event (VP), situation (TP), and proposition (CP) domains, with transitional projections establishing relations among the domains. Specifically, AspP – essentially Cowper's (2010) EP – establishes a relation between the v/VP and TP, where an event is converted to a situation, while FinP (the lowest projection in Rizzi's (1997) split CP) establishes a relation between TP and CP, where a situation is converted to a proposition. It is in the CP that the propositional content of the clause becomes anchored to the utterance context, since that is the domain where speaker-oriented parameters reside. The diagram in Figure 2 shows these domain associations.

Figure 2: Domains & transitional projections (Ramchand & Svenonius 2014: 164)

I will claim that whereas past is marked relative to non-past in English, the opposite holds in Russian. That is, whereas past in English is the spellout of (minimally) [Precedence], Russian does not have [Precedence] in its Infl feature inventory. Rather, Russian has the feature [Coin] (Ritter & Wiltschko 2005; Wiltschko 2017; 2014) as a dependent of [Finite], and does not have [Deixis].<sup>2, 3</sup> Unlike in Wiltschko (2017), however, [Coin] is here a privative feature, as described above. Moreover, while [Deixis] establishes an anchor to the utterance time relative to which [Precedence] situates the event, I claim that [Coin] anchors a proposition to the utterance context temporally within the Infl domain and personally (to the speaker) within the C domain. As a feature in Force, the head that hosts complementizers like English *that* and provides information about clause type, [Coin] associates the clausal content to the speaker's perspective.

<sup>2</sup>The difference between [Deixis] and [Coin] lies in [Deixis] having been proposed as a feature that in English links temporal and speaker properties to the utterance context, whereas what [Coin] associates to the utterance context depends on where in the syntactic spine it is specified à la Ramchand & Svenonius (2014).

<sup>3</sup>[Interval], I claim, is also absent in Russian. Instead, the feature [Atomic] is a dependent of [Event], as I have argued based on the fact that stative predicates in Russian cannot bear non-derivational perfective morphology. See Melara (2014) for further discussion.

### **4.2 The Infl system in Russian**

Adopting the tools from Cowper (2005; 2010), I propose the fully articulated dependency structure in Figure 3 for the Russian Infl system. Note that here, as mentioned above, [Irr] heads its own projection rather than being part of T, unlike in Cowper (2010) (note that my Irr corresponds to Cowper's Mod). I assume that a functional head cannot be projected in the absence of any specified features. Thus, while for Cowper, the lexical properties of modals also reside in Mod, I take Irr to be a purely functional head, merged only when [Irr] is specified. This is where a modal particle such as *by* in Russian is merged. Similarly, T is the projection of the feature [Fin(ite)].

Figure 3: Russian Infl dependency structure

The fact that the Russian subjunctive is compatible only with the past marker *-l* or the infinitive results from the selectional requirements of the functional heads in the Infl system. As stated earlier, I assume, based on Ramchand & Svenonius (2014), that the Infl domain temporally situates an event, while the C domain anchors the situation personally, both with respect to the utterance context.

As in Cowper (2010), EP is projected in non-stative clauses, selecting the vP. It is in E that non-derivational aspectual affixes reside. TP hosts the features [Fin] and [Coin]. [Fin] is the locus of nominative case and agreement. [Coin] establishes coincidence between the event described by the vP and the temporal properties of the utterance context. Russian, I claim, lacks any tense features. Instead, the past/non-past distinction is attributable to the presence or absence of [Coin]. Specified in Infl – the temporal domain – [Coin] semantically situates the event described by the clause at a non-past time and is spelled out by non-past morphology. Both T and E bear a strong uninterpretable V feature [*u*V], requiring that v, containing V, move up at least to T to satisfy and check the [*u*V] of each head locally. I propose that in Russian, when [Coin] is absent, the past suffix *-l* is spelled out on the verb. That is, *-l* spells out a T specified only for [Fin], hence the past tense morpheme being unmarked relative to the non-past.

The [Irr] feature that *by* spells out encodes irrealisness. The irrealis meaning of [Irr] is semantically at odds with the binding established by [Coin]. When IrrP is projected, [Irr] scopes over the entire Infl domain (but cf. Cowper 2010 for discussion on how NegP is the highest projection in Infl) and essentially has the semantics of ExclF scoping over times, proposed by Iatridou (2000). As described in §3, ExclF is equivalent to [−Coin] from Ritter & Wiltschko's (2005; 2014) proposals. Thus, under an analysis according to which [Coin] is a privative feature, its specification coincides with the [+Coin] valuation and the anchoring of the proposition described by the clause to the utterance context. In case [Irr] and [Coin] were to be specified together, the Infl domain would be specified, in essence, for both [−Coin] and [+Coin]. If the Infl domain is what indicates whether an eventuality is anchored to the utterance context (temporally) as a whole, it cannot be both necessarily associated with and not associated with the utterance context, which is what specifying both + and − values for [Coin] would entail. Overall, there must be agreement within the domain with respect to the clause's association to the utterance context. Therefore, while Irr must check its [*u*V] feature, it cannot do so if [Coin] is specified on T. On the other hand, Irr may freely merge with a TP lacking [Coin]. In this case, *by* is spelled out with past morphology on the verb.

The well-formedness of *by* with the infinitive form of the verb is predicted in a similar fashion. In the absence of TP, Irr may merge directly with EP, satisfying its requirements for [*u*V]-checking in the same way as it would have in being merged with TP. As long as [Coin] is absent, Irr can freely merge with EP (or vP for that matter). Observe that the absence of [Fin] – whose specification licenses nominative case assignment and agreement – would predict that the subject not appear in its nominative form and the infinitive form of the verb would arise without subject agreement marking. This prediction is borne out, at least with respect to case assignment. Note again in the following example, repeated from (20), that the matrix clause containing *by* and the *-l* suffix contains a subject in the nominative case. Conversely, the construction with *by* and the infinitive form of the verb contains a dative subject.

	- b. Oj s"est' by Pete {včera / zavtra} jabloko!
	  oh eat.inf by Peter.dat {yesterday / tomorrow} apple
	  'If only Peter would eat an/the apple tomorrow!' or 'If only Peter would have eaten an/the apple yesterday!' (Asarina 2006: 10)

Recall that the subject surfaces in a position higher than *by*. I assume that the EPP property holds of the highest head in the Infl domain. I make no commitment to any particular version of the EPP; for our purposes, it simply requires that the external argument appear in the specifier of the highest Infl head. I conjecture that the external argument may move to the specifier of T, where it receives case and values the uninterpretable phi-features of T. It may then move on further to the specifier of Irr, where it satisfies the EPP. In by+infinitive constructions, TP is absent, hence the lack of agreement on the verb.

I speculate that Irr, when [Irr] is specified, bears some sort of feature that is optionally strong, allowing for the various available positions of *by* within the clause. It is unclear what exactly this feature is and why it optionally takes the verb or the VP more locally. An alternative explanation would be that *by* is phonologically a clitic, which would capture why the form cannot appear clause-initially. In fact, there is no generally accepted theory of Russian word order as of yet (see Kallestinova & Slabakova 2008 and Bailyn 2011 for discussion), with subjunctive data muddying the waters even more. What the reader, I hope, has been convinced of is that *by* spells out a head in the Infl domain. The form interacts directly with Infl categories/properties, namely tense and finiteness, both in terms of distribution and interpretation. If *by* were to spell out a feature in the CP domain, one would expect it to licitly appear clause-initially, which it cannot. While I have discussed only SVO-ordered clauses, work on *by* in other word orders would shed light on *by*'s position variability.

In summary, *by* is incompatible with the non-past tense because the non-past morphology spells out the feature [Coin], which itself is semantically at odds with the lack of connection to the utterance context encoded by [Irr], which *by* spells out. It is the lack of [Coin] in infinitival constructions that allows them to appear with *by*. Table 1 lists the featural specifications of the indicative and subjunctive possibilities that have been discussed.<sup>4</sup>


Table 1: Indicative and subjunctive morphology in Russian

Overall, *by* requires that the event not be bound by the utterance situation; it therefore cannot be anchored with respect to person or time. This conforms to Jespersen's (1924: 319) claim, cited in Cowper (2002: 10), that the subjunctive expresses a perspective other than the speaker's. Moreover, the semantics expressed by *by*, such as obligation, desirability, advisability, and hypothesis, are captured by this analysis in treating *by* as an irrealis particle.
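The compatibility pattern derived in this section can be restated as a small executable sketch. The Python function below is an illustration only and not part of the analysis: the name `spell_out` and the string labels are hypothetical, aspect ([Event], [Atomic]) is ignored, and the clash between [Irr] and [Coin] is simply stipulated as a filter rather than derived from the semantics.

```python
from typing import FrozenSet, Optional


def spell_out(features: FrozenSet[str]) -> Optional[str]:
    """Map a bundle of Infl features to a surface pattern.

    Returns None when the bundle is ruled out. Labels are hypothetical
    shorthand for the patterns discussed in the text.
    """
    # [Coin] is a dependent of [Fin], so it cannot be specified without it.
    if "Coin" in features and "Fin" not in features:
        return None
    # [Irr] and [Coin] make conflicting demands on anchoring to the
    # utterance context, so they cannot be specified together.
    if "Irr" in features and "Coin" in features:
        return None

    if "Fin" not in features:
        verb = "infinitive"   # no TP projected: no agreement, no nominative
    elif "Coin" in features:
        verb = "non-past"     # [Coin] is spelled out by non-past morphology
    else:
        verb = "-l"           # T specified only for [Fin]: unmarked past form

    # *by* spells out Irr whenever [Irr] is specified.
    return f"by + {verb}" if "Irr" in features else verb
```

Under these assumptions, `spell_out(frozenset({"Fin", "Irr"}))` yields the *by* + *-l* pattern, while adding `"Coin"` to the same bundle yields no output, mirroring the ill-formedness of *by* with non-past morphology.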

### **5** *By* **in complement clauses**

Work on *by* typically makes note of the particle's tendency to move to second position in a clause when some sort of complementizer appears in C (Hacking 1998: 29). For instance, there is a strong tendency for *esli* 'if' and *by* to appear adjacent to one another in the antecedent of a conditional, as in (26a). An antecedent with *esli* in which *by* appears farther from the complementizer, as in (26b), is degraded for many speakers.


<sup>4</sup>Concerning line 3 in Table 1, one could think of *by* as requiring that the clause within which it appears is specified for [−Coin] (in both the Infl and C domains). The postulation of binary features in this analysis, however, would lead to overgeneration.


*Čto* 'that' also bears a tight relation to *by*. It has been noted, however, that there are speakers for whom (27a) is interpreted as equivalent to (27b). For those who do not get the same interpretation, (27a) merely sounds like an incomplete embedded conditional (Brecht 1977: 40).

(27) a. Ja nikogda ne duma-l, **čto** Jura **by** ėto sdela-l.
        I never neg think-pst that Jura by this do-pst
     b. Ja nikogda ne duma-l, **čtoby** Jura ėto sdela-l.
        I never neg think-pst čtoby Jura this do-pst
     'I never thought that Jura would do that.' (Brecht 1977: 40, fn. 10)

Given the high markedness for speakers, it might be that *esli by* and *čtoby* are separate lexical items from the independent *esli*, *čto*, and *by*. Brecht (1977) shows, though, that when the embedded clause is comprised of two (and presumably more) conjuncts, *čtoby* appears in the first clause and the second conjunct contains only an instance of *by*, as in (28) (see similar discussion on *esli by* in Hacking 1998: 29–32).

(28) Ty vele-l, **čtoby** ja uexa-l v Minsk odin, a Vasja **by** ostalsja s toboj?
     you order-pst čtoby I leave-pst at Minsk alone and Vasja by remain with you
     'Did you order that I leave for Minsk alone and Vasja remain with you?' (Brecht 1977: 36)

Furthermore, Barnetová et al. (1979), cited in Hacking (1998), suggest that an element that appears between *esli* and *by* receives a focused reading. In fact, according to a consultant of my own, the following receives a reading according to which *Nikol'* has contrastive focus.

(29) Esli Nikol' by mne skaza-l-a ja by vstreti-l ee v škole.
     if Nicole by me.dat tell-pst-sg.f I.nom by meet-pst her.acc at school
     'If Nicole had told me, I would have met her at school.'

Suppose *čto* and *esli* and other related complementizers appear in Force, assuming Rizzi's (1997) split CP analysis. The structure of the C domain is shown in (30), where ">" simply expresses dominance. Suppose that this full-fledged structure may also be projected in Russian.

(30) ForceP > TopP > FocP > TopP > FinP (Rizzi 1997: 297)


As previously mentioned, Force encodes information about clause type and FinP works in tandem with ForceP to select either finite or non-finite IPs (Rizzi 1997). I have argued in Melara (2014) that complement clauses selected by propositional attitude verbs lack a feature that links a clause to the perspective of the speaker, accounting for cross-linguistic differences in what has traditionally been referred to as sequence of tense phenomena. For example, in English, a past tense in a complement clause embedded under a matrix past tense will be interpreted either at or before the time of the matrix clause event (thus, exhibiting sequence of tense). This is shown in (31). In Russian, the embedded clause in the same tense configuration can instead only be interpreted as prior to the time of the matrix event, not coinciding with it (i.e. it does not exhibit sequence of tense with complement clauses). This is shown in (32). Crucially for both languages, the forward-shifted reading in complement clauses is impossible.

(31) John **said** that Mary **was** pregnant.


In line with what I am arguing for here, I proposed that indicative clauses must be both personally and temporally anchored. In matrix clauses, this is accomplished by a temporal deixis feature in Infl, a personal deixis feature in C/Force, both, or by default when there is no feature specified to express otherwise. In the absence of these anchoring features in T or C, perhaps because a language lacks them altogether, the clause is anchored by default to the utterance time and speaker in matrix clauses. Embedded clauses lacking these features are temporally and personally anchored to the time and viewpoint of the (Agent/Experiencer) subject of the embedding clause. Accordingly, in both of the English and Russian sentences above, the embedded clause lacks the personal anchoring feature in Force and the embedded clauses are interpreted relative to the perspective of the matrix subject. What makes the temporal interpretations different between the two languages, though, is that English has an anchoring feature in Infl (Cowper's 2005 [T-deixis]), while Russian does not; hence the English complement clause is temporally independent while the Russian one depends on the temporal interpretation of the higher clause.

I claim that in Russian, the same personal anchoring feature is in complementary distribution with *čto* 'that'. Let's also call this feature [Coin], manifested in the propositional domain, where anchoring to the utterance context via point of view is established. As I have claimed, *by* cannot be bound by the utterance context, due to the irrealis semantics of [Irr]. If Fin is the head that establishes a transition from situation to proposition (Ramchand & Svenonius 2014), then it is possible that [Irr] moves into Fin when the CP domain is projected in order to scope upward within the C domain to ensure that it is not being bound to the utterance context, in violation of [Irr]. This correctly predicts that it is possible, though marked for many speakers, to have a focused element between the complementizer and *by*. Furthermore, it captures *by*'s preference for the second position in the clause when the C domain is overtly projected.

If indeed [Coin] in Force creates a barrier for inter-clausal operations like temporal anchoring, then we can explain why subjunctive complement clauses embedded under a non-past matrix clause cannot receive a past tense interpretation. (33), repeated from (17), shows that a past tense subjunctive clause under a non-past matrix verb can receive a present or future reading but not a past one.

	- b. Ja xoču, čtoby Maša **sejčas** e-l-a jabloko.
	  I want čtoby Mary now ate-pst-sg.f apple
	  'I want for Mary to be eating an apple right now.'
	- c. \* Ja xoču, čtoby Maša **včera** s"e-l-a jabloko.
	  I want čtoby Mary yesterday eat-pst-sg.f apple
	  Intended: 'I want for Mary to have been eating an apple yesterday.' (Asarina 2006: 8)

The presence of *čto* in Force tells us that Force is not specified for [Coin]. This means the lower clause is temporally anchored to the time of the matrix situation. Given that in a matrix non-past context, the higher clause is specified for the temporal [Coin], the lower clause may only be compatible with readings that arise from the specification of temporal [Coin]. In order to get a past interpretation of the subjunctive complement clause, the matrix verb must appear in its past tense form, as in (34).

(34) Ja xote-l, čtoby Maša včera s"e-l-a jabloko.
     I want-pst čtoby Mary yesterday ate-pst-sg.f apple
     'I wanted for Mary to have eaten the apple yesterday.'

I claim that Russian present and future tense forms both spell out [Coin], hence their similar morphological forms. Their interpretation as present or future arises from their aspectual properties. The future reading in (33a) is therefore licit, since nothing featurally blocks the reading.

Finally and speculatively, it is possible that *čto* and *by* are over time lexicalizing as a single item, with *esli* + *by* lagging slightly in the same process. I leave this question for future research.

### **6 Conclusion**

This paper has investigated the morphosyntactic properties of what the literature refers to as the Russian subjunctive. The particle *by*, which is used to form this type of construction in Russian, has been argued to be the spellout of an irrealis head Irr. This functional head was proposed to be the highest head of the Russian Infl system, taking a TP, EP, or vP as its complement. I have claimed that Irr encodes irrealis semantics. That is, the projection of this head – the specification of the feature [Irr] – establishes that the proposition denoted by the clause is not bound to the utterance context. Its projection is therefore incompatible with the feature [Coin] in either the Infl or C domains as [Coin]'s specification binds a clause to the utterance context temporally or personally, depending on where it is specified. This captures the lack of temporal dependency that matrix subjunctive clauses exhibit and the lack of commitment on the speaker's part towards the proposition expressed by the subjunctive clause. Moreover, the fact that *by* cannot appear with non-past morphology stems from the proposal that non-past tense morphology is the spellout of [Coin]. In essence, then, the subjunctive–indicative mood (or better yet, the irrealis–realis) distinction in Russian is one that lies in the projection or non-projection of [Irr].

The analysis presented in this paper ultimately results in the proposal that the non-past tense is marked relative to the past in Russian. Additionally, with *by* spelling out a head whose semantics are inherently irrealis, the analysis also captures the modal-like interpretations of the Russian clauses that contain *by*, which express obligation, desire, advisability, hypothesis, and so forth on the part of the subject. Also shown was the fact that *by* cannot appear in clause-initial position. This restriction was argued to be due to the fact that *by* moves to the head of FinP in the C domain, which itself is selected by one of the higher heads of an expanded CP layer.

As noted by a reviewer, the analysis presented here clearly runs *contra* the literature on the subjunctive. The subjunctive has typically been considered syntactically/semantically impoverished relative to the indicative mood. Under the analysis presented in this paper, the Russian subjunctive is structurally more marked than the indicative. Ultimately, this analysis supports Wiltschko's conclusion that while categories like indicative and subjunctive may be universal, the way in which they are constructed is language specific. While further work on the morphological closeness of *čto* and *by* ought to be conducted, the analysis presented in this paper has nonetheless proposed a framework of the language's Infl properties from which further work can springboard.


### **Abbreviations**

### **Acknowledgements**

I am greatly indebted to my committee members: Elizabeth Cowper – my advisor, Michela Ippolito, and Alana Johns, for their guidance, comments, and support on this project. Without any of it, this project would not have been possible. I also owe many thanks to Alëna Aksënova, Oleg Chausovsky, Julie Goncharov, Iryna Osadcha, and Magda Makharashvili for dedicating time to share their knowledge of Russian with me – thank you. Finally, I sincerely appreciate the time the reviewers of this paper contributed to help improve it – your feedback has been invaluable. Of course, any remaining errors are my own.

### **References**



*choring events to utterances without tense*, 343–351. Somerville, MA: Cascadilla Proceedings Project. http://www.lingref.com/cpp/wccfl/24/paper1240.pdf.


### **Chapter 11**

## **Head directionality in Old Slavic**

### Krzysztof Migdalski

University of Wrocław

This paper investigates the issue of head directionality in Old Slavic. This issue has played an important role in diachronic studies on Germanic, in which a switch in head directionality was assumed to have triggered word order changes in the history of these languages. Within Slavic, Old Bulgarian and Old Church Slavonic have been claimed to partly feature head-final grammars by Pancheva (2005; 2008) and Dimitrova-Vulchanova & Vulchanov (2008), in contrast to contemporary Slavic languages, which are head-initial. This paper shows that there is little evidence for head-finality in Old Slavic.

**Keywords:** directionality parameter, clitics, participle movement, Old Church Slavonic, Old Bulgarian

### **1 Head directionality**

The hypothesis of head directionality has its roots in Greenberg's (1963) empirical generalizations concerning the position of the verb with respect to the direct object in the verb phrase and the correlation between object placement and the ordering of other elements. Greenberg observed that the order within VP has typological implications: VO languages have prepositions, whereas OV languages have postpositions. Within the framework of Principles and Parameters, this correlation is straightforwardly captured through the postulate of the head parameter, which implies that languages show variation concerning the order of the head with respect to its complement (see Vennemann 1972 and Dryer 1992; 2007 for discussion). On the assumption that in spite of crosslinguistic variation the head–complement order within a single language is invariant, in head-initial languages the complement always follows the head, hence the object follows the verb and the preposition precedes its nominal complements. Correspondingly, in head-final languages the object precedes the verbal head, the way a nominal complement precedes its postposition.

Krzysztof Migdalski. 2018. Head directionality in Old Slavic. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 241–263. Berlin: Language Science Press. DOI:10.5281/zenodo.2545527
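The idealized head parameter introduced in this section can be made concrete with a minimal sketch, assuming a single language-wide setting that every head obeys. The `Phrase` class, the `linearize` function, and the toy English lexical items are illustrative names introduced here, not part of any analysis cited in this chapter.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Phrase:
    """A head plus its complement (a bare word or another phrase)."""
    head: str
    complement: Union["Phrase", str]


def linearize(node: Union[Phrase, str], head_initial: bool) -> List[str]:
    """Spell out a structure under a single, language-wide head setting."""
    if isinstance(node, str):
        return [node]
    complement = linearize(node.complement, head_initial)
    if head_initial:
        return [node.head] + complement   # head precedes its complement
    return complement + [node.head]       # head follows its complement


# Toy lexical items, chosen only for illustration (an assumption of this sketch).
vp = Phrase(head="read", complement="books")   # verb + object
pp = Phrase(head="in", complement="London")    # adposition + nominal

for setting in (True, False):
    label = "head-initial" if setting else "head-final"
    print(label, linearize(vp, setting), linearize(pp, setting))
```

Because one Boolean governs every phrase, the sketch derives VO together with prepositions and OV together with postpositions, which is the correlation attributed to Greenberg above. A language with mixed settings would need a setting per category rather than a single flag.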

It has been observed, however, that not all languages display a consistent setting of the head parameter (see Hawkins 1980; 1982). For instance, a well-known case of inconsistency is that of German. Although German is predominantly head-initial, the verb is final in non-finite verb phrases, while adjective phrases may be both head-final and head-initial. In diachronic studies, it has been postulated that the setting of the head parameter may switch in language history. For instance, Pintzuk (1991) shows that although Old English (OE) featured mainly OV (head-final) structures, there were also minor instances of VO orders, as evidenced by exceptional structures involving particles, see (1a), and personal pronouns following the verb, see (1b).

	- b. We wyllað secgan **eow** sum bigspell
	  We want tell you a parable
	  'We want to tell you a parable' (OE, Fischer et al. 2004: 141)

On Pintzuk's analysis, the post-verbal placement of particles and objects is indicative of the head-initial setting of VP, which in Old English constitutes a minority pattern. This pattern is assumed to be in competition with the more common head-final VP order instantiated by OV structures.

The hypothesis of grammar competition was postulated by Kroch (1989) in order to capture a period of diachronic variation between two structures that are not compatible with each other within a single grammar. Two such structures are assumed to represent two contradictory parameter settings (such as head-final versus head-initial constructions), or, within the Minimalist framework, the presence of lexical items with contradictory features (see also Pintzuk 2002: 278). The postulate of grammar competition has resulted in many fruitful analyses of diachronically unstable structures. For example, Haeberli & Pintzuk (2006) investigate the position of the main verb and the auxiliary with respect to adjuncts and complements in verb clusters in Old English and attribute the observed word order variation to a switch in head directionality of functional projections in Old English.


Within Slavic, a switch in head directionality is assumed to trigger a change in the cliticization in Pancheva's (2005) analysis. This paper argues for a different view, and it is organized as follows. §2 examines the arguments for head finality provided by Pancheva (2005) on the basis of a diachronic modification of cliticization patterns in Bulgarian. §3 overviews Pancheva's (2008) argumentation related to participle–auxiliary orders and the position of negation in Old Church Slavonic.<sup>1</sup>

### **2 Pancheva's (2005) analysis of head directionality in Old Slavic**

Most analyses of Old Church Slavonic syntax (Willis 2000; Jung 2015; Jung & Migdalski 2015; Migdalski 2016) assume that it was head-initial on a par with Modern Slavic languages. The exceptions are accounts due to Dimitrova-Vulchanova & Vulchanov (2008), who postulate that it was X<sup>0</sup>-final in the VP-domain and X<sup>0</sup>-initial in the CP-domain, as well as Pancheva (2005; 2008), who argues that it was T<sup>0</sup>-final on the basis of the position of pronominal clitics, negation, and participles with respect to the auxiliary. However, a challenge that these analyses face is the fact that a switch in head directionality should have triggered a major modification of the syntactic structure of these languages. Such a modification did not occur; moreover, in contrast to Germanic languages, all contemporary Slavic languages are strictly head-initial. In view of this, the subsequent section will show that there is little evidence for head-finality in Old Slavic. In §2.1 I provide an overview of Pancheva's analysis of diachronic Bulgarian data. In §2.2 I present a criticism of her account.

### **2.1 Pancheva's (2005) study of the diachrony of cliticization patterns in Bulgarian**

Pancheva (2005) provides a detailed analysis of the diachrony of cliticization patterns in the history of Bulgarian. She establishes that in the earliest stages (9th–13th c.), Old Bulgarian displays largely the same distribution of clitics as Old Church Slavonic. Namely, the clitics occur after the verb, as shown in (2). As the verb does not need to be located clause-initially, they are clearly not second position clitics. Although contemporary Bulgarian also features verb-adjacent cliticization, it normally disallows post-verbal clitic placement.

<sup>1</sup>This paper presents a further development of the analysis proposed in Migdalski (2016).


(2) svętь bô mõžъ stvorilъ **ja** jestь
    holy because man create.part.m.sg them.acc is.aux
    'Because a holy man has created them' (9th c. Bg, Pancheva 2005: 139)

Pancheva assumes, following Kayne (1991), Chomsky (1995), and corresponding analyses of verb-adjacent cliticization, that underlyingly pronominal clitics are generated as VP arguments. They move from XP-positions in VP and left-adjoin to T<sup>0</sup> as heads. Crucially, the fact that the accusative pronominal clitic precedes the auxiliary verb in (2) leads her to assume that Old Bulgarian is a T<sup>0</sup>-final language in which all the other heads are initial.

(3) [<sub>TP</sub> [<sub>vP</sub> [<sub>V′</sub> t<sub>i</sub> V<sup>0</sup>]] [<sub>T</sub> CL<sub>i</sub> T<sup>0</sup>]] (Pancheva 2005: 139)

Another assumption made by Pancheva (2005: 146) is that although in Old Bulgarian lexical verbs leave vP, they do not reach T<sup>0</sup> but only Asp<sup>0</sup>, located below T<sup>0</sup>. This means that her evidence for the final T<sup>0</sup> comes from the position of the auxiliary 'be' (such as *estь* in (2)) located in T<sup>0</sup> with respect to pronominal clitics (such as *ja* in (2)).

The post-verbal cliticization was the dominant pattern in Bulgarian until the 13th century. Subsequently, Wackernagel (second position) cliticization prevailed and remained the default type until the 17th century. Pancheva attributes this change to a switch in the head parameter of T<sup>0</sup>, which became head-initial. She claims that as a result of the switch pronominal clitics begin to appear in front of T<sup>0</sup> and their position with respect to the verb becomes reversed, as shown in the derivation in (4a). Since other elements may now occur between the verb and the clitic, the verb is no longer analyzed as the clitic host by the speakers. The clitics remain phonologically enclitic and receive prosodic support from their new hosts located in SpecTP, see (4b) and (4c), or SpecCP.


Pancheva notes a syntactic restriction on the lexical elements preceding second position clitics during this period. She observes that in contrast to contemporary Slavic languages with Wackernagel clitics, the clitics in the Bulgarian corpus data from that period occur strictly after the first word, which in some cases results in Left Branch Extraction. There are no instances of clitics following the first branching phrase. The same observation is made by Radanović-Kocić (1988: Chapter 3) for the earliest stages of the development of Wackernagel cliticization in Old Serbian. Second position cliticization with clitics preceded by unambiguous phrasal elements became available in Serbian only at a later stage. I take this correlation to mean that the Old Bulgarian data analyzed by Pancheva (2005) exemplify the initial stage of the emergence of second position cliticization, which was not completed. Incidentally, this syntactic restriction on second position cliticization cannot be captured by Pancheva's derivation presented in (4a), given that she assumes that the pre-clitic element is located in an XP-projection: SpecTP or SpecCP.<sup>2</sup>

The third stage of the diachronic change investigated by Pancheva takes place from the 17th c. onwards, when second position clitics in Bulgarian are reanalyzed as preverbal clitics. This pattern prevails in the 19th century and continues to be the default cliticization type in contemporary Bulgarian. Pancheva points out that this change was contemporaneous with the loss of obligatory topicalization to SpecTP. The topicalization affected a number of unrelated categories, including the demonstrative *tova* in (4b) and the subject *tïa* in (4c). Pancheva argues that the decline of topicalization had repercussions for the syntax of clitics: as SpecTP became filled less frequently, the clitics were no longer analyzed as hosted in second position by a constituent located in SpecCP or SpecTP. Instead, the clitics started to appear more frequently adjacent to the verb. In syntactic terms this meant, in Pancheva's view, that they were reinterpreted as items merged in X<sup>0</sup> positions, adjoined to functional heads in the extended projections of the verb, see (5a), rather than as XP elements that move from argument positions within VP and head-adjoin to T<sup>0</sup>.<sup>3</sup> With the loss of second position interpretation, the clitics could be located lower in the structure, next to the verb, as illustrated in (5b) for the reflexive clitic *sa*, which is left-adjacent to the verb *javi*.

<sup>2</sup> In some Slavic languages, such as Serbo-Croatian, the second position clitic *li*, which functions as a focus or interrogation marker, may also be preceded exclusively by single words, as illustrated in (i), following Bošković's (2001: 27) observation.

(i) Skupe (**li**) knjige (\***li**) Ana čita?
    expensive q books q Ana reads
    'Does Ana read expensive books?' (S-C, Bošković 2001: 27)

Bošković (2001: 31ff.) attributes the restriction to the syntactic deficiency of *li* in Serbo-Croatian, which is not able to support a specifier, and the focus feature of *li* may only be checked through head movement. In fact, this is a special property of "operator clitics" expressing the illocutionary force of a clause, which in many Slavic languages display special requirements concerning the categorial and syntactic status of their preceding element, in contrast to pronominal and auxiliary second position clitics. See Migdalski (2016: Chapter 3) for discussion.


### **2.2 Empirical problems with Pancheva's (2005) analysis**

Pancheva's analysis addresses a remarkably large set of data, covering different cliticization patterns in the history of Bulgarian. Although her empirical observations are impressive, the analysis suffers from a number of serious shortcomings.

First, the postulated link between head directionality and a cliticization pattern does not receive any support from synchronic considerations. As is well-known, contemporary Slavic languages display two distinct patterns of cliticization (see, e.g., Franks & King 2000). On the one hand, Czech, Serbo-Croatian, Slovak, and Slovenian feature second position clitics, which obligatorily occur after the clause-initial element virtually irrespective of its category. This type of clitic distribution is illustrated in (6) for a sequence of auxiliary and pronominal clitics in Serbo-Croatian. The clitics can be preceded by a number of different categories, including the subject, see (6a), a wh-element, see (6b), and an adverb, see (6c).

	- b. Zašto **smo** **mu** **je** predstavili juče?
	  why are.aux him.dat her.acc introduce.part.pl yesterday
	  'Why did we introduce her to him yesterday?'
	- c. Juče **smo** **mu** **je** predstavili.
	  yesterday are.aux him.dat her.acc introduce.part.pl
	  'Yesterday we introduced her to him.' (S-C, Bošković 2001: 8–9)

<sup>3</sup> An anonymous reviewer points out that Pancheva's account of the reanalysis of clitics fits into the economic factor assumed in grammaticalization, "Merge as a head, not a phrase." However, Jung & Migdalski (2015) show that this factor is challenged by the degrammaticalization of pronominal clitics into weak pronouns, which occurred in Old Russian and Old Polish.


On the other hand, two Slavic languages, Bulgarian and Macedonian, have verb-adjacent clitics, which may not be separated from the verb by any intervening material, see (7a). As shown in (7b), these clitics do not need to target second position.

	- b. Včera Vera **mi go** dade. (Bg, Franks 2010: ex. (111d,c))

The Slavic languages that display these two cliticization patterns differ in a number of ways. For instance, only the languages with verb-adjacent clitics have definite articles (see Bošković 2016) and tense morphology (see Migdalski 2015; 2016). Crucially, they are all head-initial irrespective of their cliticization system.

Diachronically, the verb-adjacent pattern of clitics predates second position cliticization. It has been observed by Radanović-Kocić (1988) and Pancheva (2005) that in Old Church Slavonic pronominal clitics were predominantly verb-adjacent, as shown for the dative clitic *mi* in (8a) and for the accusative clitic *tę* in (8b).


Although pronominal clitics could occur in second position in Old Church Slavonic, especially when the clause-initial element was a verb (and hence they were verb-adjacent), Radanović-Kocić (1988) points out that only three clitics appeared in second position without exception: the question/focus particle *li*, the complementizer clitic *bo* 'because,' and the focus particle *že,* see (9a)–(9c).

(9) a. Približi **bo** sę crstvie nbskoe.
       approach.aor.3sg because refl kingdom heaven
       'For the kingdom of heaven is at hand.' (OCS, *Matthew* 3:2, Radanović-Kocić 1988: 152)

b. Mati **že** jego živĕaše blizъ vratъ.
   mother foc his live.imp.3sg near gates
   'And his mother lived near the gates.' (OCS, Radanović-Kocić 1988: 152)

c. Ašte **li** oko tvoĕ lõkavo bõdetъ
   if q eye your evil be.pres.sg.n
   'If your eye should be evil' (OCS, *Matthew* 6:23, Radanović-Kocić 1988: 151)

I observe in Migdalski (2016) that the second position clitics exemplified in (9a)–(9c) form a natural class of sentential (operator) clitics. The semantic property that unifies them is that they all encode the illocutionary force of a clause. The counterparts of these clitics in contemporary Slavic languages also target second position, regardless of whether their pronominal and auxiliary clitics also occupy Wackernagel position or whether they are verb-adjacent. Thus, as shown in (10), although Bulgarian has verb-adjacent clitics, the clitic *li* is in second position, separated from the accusative clitic *ja* and the auxiliary clitic *je*.

(10) Včera **li** Penka **ja** **e** dala knigata na Petko?
     yesterday Q Penka her.refl is.aux give.part.f.sg book.the to Petko
     'Was it yesterday that Penka gave the book to Petko?' (Bg, Tomić 1996: 833)

The fact that Pancheva (2005) disregards the categorial status of clitics located in respective positions in her estimates of the different types of clitic placement is a major drawback of her analysis. In fact, this problem has also been pointed out by Dimitrova-Vulchanova & Vulchanov (2008), who, referring to Pancheva's (2005) analysis, note that in *Codex Suprasliensis* (a late Old Church Slavonic relic) the distribution of clitics is quite consistent and regular, and it does not seem to be a matter of statistical frequency or choice. Dimitrova-Vulchanova & Vulchanov observe that in *Codex Suprasliensis* clitics are found in second position if SpecCP is filled, otherwise they are post-verbal. Although Dimitrova-Vulchanova & Vulchanov do not provide any data in support of their observation, it is likely that SpecCP is filled in the presence of operator clitics of the type exemplified in (9), which are uniformly hosted in second position.

In Migdalski (2016) I further observe that Pancheva's analysis is challenged by synchronic and diachronic cliticization data from Slavic. On the synchronic side, a problematic empirical fact is that the clitic forms of the auxiliary verb 'to be' in South Slavic languages occupy a different position with respect to pronominal clitics depending on their person feature content. Namely, as indicated for Serbo-Croatian in (11), the 3rd person auxiliary clitic (such as *je* in (11a)) is located to the right of the pronominal clitics, while all the other auxiliary variants (such as the 1st person form *sam* in (11b)) are hosted to the left of the pronominal clitics.

	- b. Ja **sam** **mu** **ih** dao.
	  I am.aux him.dat them.acc give.part.sg.m
	  'I gave them to him indeed.' (S-C, Tomić 1996: 839)

If Pancheva's account of cliticization were to be adopted to account for the auxiliary clitic placement, it would imply that in contemporary South Slavic languages T<sup>0</sup> is head-final in the structures with the 3rd person singular auxiliary, and that T<sup>0</sup> is head-initial with all the other auxiliary forms. This is not a welcome result given that the auxiliaries assume a different position in the structure purely depending on their person/number feature specification. The nature of this morphological contrast suggests that it does not involve alleged competition between two grammars that differ with respect to T<sup>0</sup>-initial and T<sup>0</sup>-final placement but rather that the contrast is entirely synchronic.
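The descriptive generalization about Serbo-Croatian auxiliary clitics can be stated as a single ordering rule, which is what makes a two-grammar account unattractive. The following is a minimal sketch under that assumption; `Aux` and `order_cluster` are hypothetical names, and the 3rd person singular cluster *mu ih je* is supplied here only as an illustrative input, since no such example is reproduced in this section.

```python
from typing import List, NamedTuple


class Aux(NamedTuple):
    """An auxiliary clitic with its person and number features."""
    form: str
    person: int
    number: str  # "sg" or "pl"


def order_cluster(aux: Aux, pronominal: List[str]) -> List[str]:
    """Place the auxiliary relative to the pronominal clitics.

    Descriptive rule only: the 3rd person singular auxiliary follows the
    pronominal clitics, every other auxiliary form precedes them.
    """
    if aux.person == 3 and aux.number == "sg":
        return pronominal + [aux.form]
    return [aux.form] + pronominal


# 1st person singular, as in (11b): sam mu ih
print(order_cluster(Aux("sam", 1, "sg"), ["mu", "ih"]))
# 3rd person singular (illustrative input, an assumption of this sketch)
print(order_cluster(Aux("je", 3, "sg"), ["mu", "ih"]))
```

The rule consults nothing but the feature specification of the auxiliary, so the two orders coexist in one grammar without any change in the directionality of T<sup>0</sup>.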

On the diachronic side, Pancheva's proposal of the switch in the head directionality of T<sup>0</sup>, which relies on the position of pronominal clitics with respect to the auxiliary, is seriously challenged by the timing of the diachronic modification of the auxiliary placement in the history of Bulgarian. I report in Migdalski (2016: 283–284), following Sławski's (1946) observations, that in Old Bulgarian all auxiliary forms followed pronominal clitics, as in the pattern in (2) above, which is used by Pancheva as evidence for the T<sup>0</sup>-final order. Two additional Old Bulgarian examples in which a non-third person auxiliary follows the pronominal clitics are given in (12). At first sight they may seem to lend support to Pancheva's analysis, since in contrast to contemporary Slavic languages, all auxiliary forms are located to the right of the pronominal clitics.


b. tvoè zlàto što **mu** **si** pròvodilь
   your gold that him.dat are.aux.2sg send.part.sg.m
   'Your gold that you have sent to him' (17th c. Bg, Sławski 1946: 76)


However, in the 17th–18th century the auxiliary placement in Bulgarian underwent a modification: the first and second person auxiliary forms shifted across the pronominal clitics, adopting the current distribution (Sławski 1946: 76–77), as exemplified in (13). The timing of the modification is a problem for Pancheva (2005), as it took place when according to her analysis Bulgarian had featured a T<sup>0</sup>-initial grammar for several centuries, with no second position clitics left.


I observe that the timing of the switch of the auxiliary forms indicates that second position cliticization is not related to the alleged loss of T<sup>0</sup>-finality or the position of pronominal clitics with respect to the auxiliary. The lack of the correlation between these properties is also independently confirmed by Jung's (2015) study of the auxiliary placement in Old Russian data. Jung points out that even though Old Russian had second position clitics until the 14th century, the first and second person forms of the auxiliary rigidly followed the pronominal clitics throughout this period. Furthermore, in Migdalski (2015; 2016) I develop an analysis of a diachronic switch from verb-adjacent to Wackernagel clitics in Serbo-Croatian, Slovenian, and Polish, showing that it was contemporaneous with the loss of tense morphology, analyzed as the loss of TP. It remains to be determined whether a related analysis can be applied to the Old Bulgarian facts noted by Pancheva (2005).

### **3 Pancheva's (2008) arguments for the final T<sup>0</sup> related to participle–auxiliary orders and the distribution of negation**

This section examines the arguments for the T<sup>0</sup>-finality of Old Church Slavonic that Pancheva (2008) provides in her later work. They are related to the syntax of compound tenses formed with the *l*-participle and the auxiliary 'be' and the interaction between negation and verb placement.


### **3.1 Participle–auxiliary orders in Old Church Slavonic**

Most South and West Slavic languages feature a compound tense construction formed with the auxiliary 'be' and the *l*-participle; see (14a) for Bulgarian. The *l*-participle may be fronted across the auxiliary, as in (14b).

	- b. Čel sŭm knigata.
	  read.part.sg.m am.aux book.the
	  'I have read the book.' (Bg)

This operation has received considerable attention in the literature since Lema & Rivero's (1989) analysis of the fronting in terms of Long Head Movement, which on their account proceeds via head raising of the *l*-participle from V<sup>0</sup> to C<sup>0</sup> across the auxiliary located in I<sup>0</sup>, as shown in (15).

(15) [<sub>CP</sub> [<sub>C</sub> Part<sub>i</sub>] [<sub>IP</sub> Aux [<sub>VP</sub> [<sub>V</sub> t<sub>i</sub>] DP]]]

The operation has also been analyzed as head adjunction of the participle to C<sup>0</sup> (Wilder & Ćavar 1994), to Aux<sup>0</sup> (Bošković 1997), or to a focus projection Delta<sup>0</sup> (Lambova 2003). I proposed in my previous work (Broekhuis & Migdalski 2003; Migdalski 2006) that the movement involves predicate inversion, which proceeds via XP remnant movement of the *l*-participle to SpecTP. This proposal accounts for a number of properties of the movement that had been unexplained in previous analyses, such as the dependency of the phrasal movement on the presence of the auxiliary 'be' and the subject gap requirement, a property that will be important for the analysis presented in the remainder of this article.

Pancheva (2008) addresses similar cases of clause-initial participle placement in Old Church Slavonic, as illustrated in (16b).



In principle, the Old Church Slavonic structure in (16b) most likely illustrates a counterpart of participle fronting attested in Modern Slavic, as has been argued for by Willis (2000: 325–327). Pancheva (2008) postulates, however, that on the assumption that Old Church Slavonic was T<sup>0</sup>-final, the ordering presented in (16b) could be taken to be the basic one, whereas the auxiliary–participle pattern in (16a) could be derived via rightward participle movement. In order to determine which order is the derived one, she calculates the ratio of both patterns.

Importantly, Pancheva (2008) notes that the participle–auxiliary order may be more frequent than the auxiliary–participle when the auxiliary is a clitic that needs prosodic support to its left. In order to limit the impact of the prosodic requirements on word order, she chooses to restrict her analysis to the structures involving the past tense auxiliary, which has a strong, non-clitic form. Furthermore, she assumes that the pattern that is a result of an optional operation will be statistically less common than the one that instantiates the basic order.

The results of her quantitative study show that both orders occur in a balanced proportion in Old Church Slavonic, though the participle–auxiliary pattern is less common than the auxiliary–participle pattern: 41% versus 59%. By contrast, in Modern Bulgarian the auxiliary–participle order is considerably more frequent and constitutes 97% of the data investigated by Pancheva, versus 3% of the participle–auxiliary orders. Pancheva states that on the assumption that Modern Bulgarian is T<sup>0</sup>-initial and that participle–auxiliary sequences are a result of participle movement to the left, the contrast in the ratio of the two constructions across the centuries indicates that Old Church Slavonic was a T<sup>0</sup>-final language.
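The percentages reported above can be laid side by side in a short sketch. The figures are those quoted in the text; the dictionary keys and the function `more_frequent_order` are illustrative names, and the comparison applies only the stated heuristic that an order derived by an optional operation is expected to be the less frequent one.

```python
# Shares (in percent) of the two orders with the non-clitic past tense auxiliary,
# as quoted in the text from Pancheva (2008).
shares = {
    "Old Church Slavonic": {"participle-auxiliary": 41, "auxiliary-participle": 59},
    "Modern Bulgarian": {"participle-auxiliary": 3, "auxiliary-participle": 97},
}


def more_frequent_order(distribution):
    """Return the order with the larger share in one language stage."""
    return max(distribution, key=distribution.get)


for stage, distribution in shares.items():
    print(stage, "->", more_frequent_order(distribution))

# Change in the share of participle-auxiliary orders, in percentage points.
drop = (shares["Old Church Slavonic"]["participle-auxiliary"]
        - shares["Modern Bulgarian"]["participle-auxiliary"])
print("drop:", drop)
```

On raw frequency alone, the auxiliary–participle order is the majority pattern at both stages; what differs between the stages is the size of the minority share.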

The diachronic contrast in the ratio of participle–auxiliary orders is certainly interesting and requires an explanation, though it should be noted that even in Old Church Slavonic the participle–auxiliary pattern is less frequent. Pancheva (2008) makes use of additional argumentation to support her analysis. Namely, she acknowledges the fact that the different ratios of the participle/auxiliary patterns across centuries may have been due to different discourse factors that are reflected through these two orders rather than due to the switch in the T<sup>0</sup>-head parameter setting. Thus, it may well be the case that a particular discourse context started or ceased to be expressed through participle movement at a certain point in the history of Bulgarian. Yet, she ultimately rejects this possibility, referring to an observation of different ratios between active and passive participles preceding the auxiliary. She shows that in *Codex Marianus*, an Old Church Slavonic relic, active participles are placed in front of the auxiliary in 16% of cases, while passive participles precede the auxiliary in as many as 67% of cases. In Modern Bulgarian the rate is not that high. Pancheva argues that this contrast may point to a situation in which two grammars (T<sup>0</sup>-final and T<sup>0</sup>-initial) are in competition, and that the switch in the setting of the T<sup>0</sup>-head parameter was initiated among active participles, which as a result gave rise to a higher rate of the active participle–auxiliary orders.

I would like to propose an alternative explanation of the observed diachronic frequency contrast in the participle–auxiliary orders. As has been examined in detail by Lambova (2003), participle fronting in Modern Bulgarian triggers different discourse conditions depending on whether it occurs across the present perfect auxiliary clitic (see (17a) below as well as (14b) above) or the strong past perfect auxiliary, as in (17b). Given that the auxiliary in (17a) is prosodically deficient and needs to be supported to its left, the fronting of the participle (or of some other element) to the position in front of the clitic is obligatory. In contrast, movement of the participle across the non-clitic auxiliary, as in (17b), is optional. As was mentioned above, Pancheva restricts her diachronic analysis to the orders involving participle fronting across the past tense auxiliary, which correspond to the one in (17b), and in this way she avoids a potential influence of the clitic prosodic requirement on word order possibilities.

	- a'. \* **Sa** are.aux.3pl gledali watch.part.pl filma. movie.the Intended: 'They have watched the movie.'
	- b. Gledali watch.part.pl bjaxa were.aux.3pl filma. movie.the 'They had WATCHED the movie.'
	- b'. Bjaxa were.aux.3pl gledali watch.part.pl filma. movie.the 'They had watched the movie.' (Bg, Lambova 2003: 111–112)

Lambova (2003) points out that whereas the participle movement across the auxiliary clitic illustrated in (17a) is perceived as neutral, the fronting across the past tense auxiliary exemplified in (17b) necessarily produces detectable semantic effects and is perceived as "marked." This fact is reflected in the translation of (17b), with the main verb capitalized to show a focused interpretation. Lambova (2003: 113) argues that participle fronting across the past tense auxiliary is felicitous when "the speaker is presenting the activity under discussion as an alternative." Thus, the sentence in (17b) can be produced in a situation in which "the discourse contains either explicit or implied reference to the movie being in possession, i.e. rented or owned" (Lambova 2003: 113). In such a scenario, a potential paraphrase of this example is 'They have only seen the movie.' The main verb is pronounced with a high tone, as is typical of contrastively focused constituents in Bulgarian. These properties lead Lambova to suggest that when the participle raises across the past tense auxiliary, it lands in a higher projection than it does during the fronting across the auxiliary clitic. She terms this projection Delta Phrase and assumes it is a discourse-related projection located above CP, where focus is licensed.

In Modern Bulgarian participle fronting across the past tense auxiliary results in a special discourse effect, so it is not surprising that it is not often found in the corpus examined by Pancheva. What needs to be determined is whether a related discourse effect was produced by the corresponding participle reordering in Old Church Slavonic. It is likely that it was not. In fact, in §2.1 above I refer to a discourse-related syntactic change reported in Pancheva (2005: 153–154), which occurred in Bulgarian between the 17th and the 19th centuries, and which involved the decline of obligatory topicalization targeting SpecTP. This change was accompanied by a reinterpretation of Wackernagel pronominal clitics as preverbal elements. Examples of the obligatory topicalization are given in (4) above and (18)–(20) below, and they include clauses with a topicalized object, see (4b), an adverbial participle, see (18), a finite verb, see (19), and an adverb, see (20). Pancheva notes that in Modern Bulgarian the corresponding structures are not felicitous.<sup>4</sup>


<sup>4</sup>Dimitrova-Vulchanova & Vulchanov (2008) observe a high frequency of structures of this type in Old Church Slavonic, which leads them to assume that VP is head-final in this language. However, they do not exclude the possibility of VP being head-initial, with the topicalization derived via movement.



Even though the topicalization data provided by Pancheva (2005: 153–154) does not include examples with clause-initial *l*-participles, it is quite likely that they were also subject to the rule of obligatory topicalization. Broekhuis & Migdalski (2003) and Migdalski (2006) argue on the basis of Modern Bulgarian that fronted *l*-participles target SpecTP. If the same analysis can be applied to Old Church Slavonic (see Willis 2000) and Old Bulgarian, the historically high ratio of participle movement receives a straightforward explanation: it is a product of the obligatory topicalization to SpecTP.

Another factor that may have given rise to the higher frequency of participle-initial orders in Old Church Slavonic is the fact that the complex tense structures formed with the *l*-participle and the auxiliary 'be' were considerably less common in Old Church Slavonic than they are in the contemporary South Slavic languages. Thus, Dostál's (1954: 599ff.) estimates indicate that the *l*-perfect tense was used sporadically in Old Church Slavonic, and usually in subordinate clauses. Dostál's corpus study lists 10 thousand usages of the aorist, 2300 of the imperfect tense, and only approximately 600 instances of the perfect tenses (that is, approximately 5% of all the tense forms). The scarcity of the usage of the *l*-perfect compound tense in Old Slavic has been attributed to a number of factors (see Migdalski 2006: 26–27 for discussion). For instance, Bartula (1981: 100; see also Damborský 1967) notes that there are few examples of present perfect structures in the earliest Old Church Slavonic relics. They become more frequent in later manuscripts, such as *Codex Suprasliensis* and *Savvina kniga* (both from the 11th century). Most likely, the structures formed with the *l*-participle felt too novel and innovative for formal biblical texts. The fact that these structures were far less common in Old Slavic than in present-day Slavic languages may have repercussions for the different ratios in the participle–auxiliary patterns investigated by Pancheva (2008).
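The proportion cited from Dostál's counts can be checked with a short calculation. The following is only an illustrative sketch: it assumes that the three figures quoted above make up the whole set of tense forms counted, and the names `tense_counts` and `tense_shares` are introduced here for illustration only.

```python
# Token counts as quoted above from Dostál's corpus study.
# Assumption: these three categories are the full denominator;
# the perfect count is given only approximately in the text.
tense_counts = {
    "aorist": 10_000,
    "imperfect": 2_300,
    "perfect": 600,
}


def tense_shares(counts):
    """Return each tense's share of all counted tense forms, in percent."""
    total = sum(counts.values())
    return {tense: 100 * n / total for tense, n in counts.items()}


shares = tense_shares(tense_counts)
for tense, share in shares.items():
    print(f"{tense}: {share:.1f}%")
```

Under this assumption the perfect tenses account for just under 5% of the counted forms, in line with the approximate figure given in the text.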

### **3.2 The position of negation in Old Church Slavonic**

The final observation used by Pancheva (2008) in support of her T<sup>0</sup>-final analysis of Old Church Slavonic is related to the interaction between negation and verb placement. It has been observed in the literature (see e.g. Rivero 1991) that in Modern Slavic negation may attract and incorporate into verbs, as a result of which the two elements form a single prosodic word. The process of incorporation is evidenced by the placement of second position clitics in languages such as Serbo-Croatian, which follow the sequence of negation and the finite verb, as in (21).

(21) Ne neg {\***ga**} him.acc vidim see.pres.1sg {**ga**} him.acc 'I don't see him' (S-C, Rivero 1991: 338)

As will be discussed in more detail below, contemporary Slavic languages differ with respect to whether negation attracts the (finite) auxiliary verb or the *l*-participle. Pancheva (2008) shows that in Old Church Slavonic negation may attract finite verbs, see (22a), including the auxiliary, see (22b), and, in contrast to Modern Bulgarian, in some cases also the *l*-participle, see (22c).


Pancheva assumes that in Old Church Slavonic NegP is located above TP. In view of this assumption, the fact that negation may attract the *l*-participle and as a result produce the negation–participle–auxiliary pattern is taken by Pancheva to indicate a potential T<sup>0</sup>-final structure. According to her analysis, a T<sup>0</sup>-final structure can also be postulated for negation–auxiliary–participle orders on the assumption that negation attracts the auxiliary across the participle. Importantly, Pancheva claims that since Old Church Slavonic shows variation in the verbal structures involving negation, allowing both negation–participle and negation–auxiliary orders, it is likely that Old Church Slavonic features two grammars (T<sup>0</sup>-final and T<sup>0</sup>-initial), which are in competition.

I observe that Pancheva's (2008) hypothesis of the two competing grammars, posited on the basis of the distribution of negation, is challenged by diachronic and empirical facts.

Diachronically, the position of negation with respect to the verb exhibits categorial and semantic contrasts, which suggests that it is not related to grammar competition. Thus, Večerka (1989: 34; quoted in Willis 2000: 328) observes that the negation–auxiliary order is four times as frequent as the negation–participle order. Correspondingly, Willis (2000: 329) shows that the auxiliary–negation–participle pattern is not found in matrix clauses. This type of variation is unexpected if grammar competition is involved.<sup>5</sup>

Furthermore, in subordinate clauses the position of the conditional auxiliary *bi* is related to the semantics expressed by the complementizer, which in turn may have a repercussion for the position of negation with respect to the auxiliary and the *l*-participle. As observed by Willis (2000: 330), in Old Church Slavonic complementizers may attract the conditional auxiliary. The attraction is obligatory in the case of complementizer *a,* which introduces conditional clauses, see (23), but not with the complementizer *da,* which introduces indicative clauses, see (24).

	- b. A if **by** cond.3sg sьde here bylъ be.part.sg.m 'If he had been here'
	- c. A if **by** cond.3sg bylъ be.part.sg.m prorokъ prophet 'If he had been the prophet' (OCS, Vaillant 1977: 219)

(24) a. Drъžaaxõ held.3pl **i** him da that ne neg **bi** cond.3sg otъšelъ leave.part.sg.m otъ from nixъ them 'And they held him, so that he would not leave them' (OCS, *Codex Marianus*, Willis 2000: 330)
	- b. Drъžaaxõ held.3pl **i** him da that **bi** cond.3sg ne neg otъšlъ leave.part.sg.m otъ from nixъ them 'And they held him, so that he would not leave them' (OCS, *Codex Zographensis*, Willis 2000: 330)

It can be assumed then that in subordinate clauses headed by the complementizer *a*, there will be no instances of the negation–auxiliary pattern, and that only the negation–participle order will be observed. Such a contextual, semantic-dependent restriction would be surprising if the variation were due to grammar competition. Rather, it seems that at least in the environments presented in (23) and (24), the position of negation with respect to the verb is dictated by a syntactic mechanism, which in specific contexts becomes obligatory.<sup>6</sup>

<sup>5</sup>An anonymous reviewer points out though that embedded contexts may pattern differently in processes of language change. They may be more conservative than non-embedded contexts in the case of diffusion of a change.

Synchronically, Pancheva's assumption of the potential relation between the position of negation and the directionality of T<sup>0</sup> is challenged by properties of complex tense structures in contemporary Polish and Czech. Polish, which is clearly a T<sup>0</sup>-initial language, permits negation to precede either the auxiliary or the participle. Which order is possible depends on the type of auxiliary involved. For example, negation attracts the future auxiliary (which morphologically is the perfective form of the verb 'be'), as shown in (25), but it adjoins to the *l*-participle rather than the perfect auxiliary in structures characterizing past events, as indicated in (26).


b. \* Nie-śmy parkowali tutaj samochodu. (Pl)

A corresponding variation is observed in Czech, which is also a T<sup>0</sup>-initial language. Thus, negation is adjoined to the *l*-participle, and it may not be adjoined to the auxiliary 'be'. However, negation adjoins to the verb 'be' when it is used as a copula. The distributional contrast is presented in (27) and (28).

(27) a. Přišel come.part.sg.m jsi. are.aux.2sg 'You have come.'
	- b. Nepřišel neg.come.part.sg.m jsi. are.aux.2sg 'You haven't come.'

<sup>6</sup>An anonymous reviewer provides an additional empirical fact that challenges Pancheva's assumption of a link between the position of negation, cliticization, and head directionality. Namely, Old North Russian displayed both the negation–participle order (though negation could directly precede the copular 'be') and a second position clitic system until the 14th century. On Pancheva's analysis the co-occurrence of these two properties would indicate that Old North Russian was simultaneously T<sup>0</sup>-initial and T<sup>0</sup>-final.



	- b. Nejsi neg.are.2sg hlupák idiot / zdráv healthy / na on řadě. row 'You're not an idiot / healthy / It's not your turn.'
	- c. \* Jsi are.2sg nehlupák neg.idiot / nezdráv neg.healthy / ne neg na on řadě. row Intended: 'You're not an idiot / healthy / It's not your turn.'

(Cz, Toman 1980)

Since in Czech auxiliaries and copula verbs are morphologically identical (except for the fact that the auxiliary form is null and the copula form is overt in the 3rd person singular and plural), the position of negation is clearly related to the categorial distinction between these two variants of the verb 'be'. Thus, in both Czech and Polish the position of negation and the verb is evidently contextually dependent.<sup>7</sup> It is not a result of statistical frequency and it is not contingent on the head directionality of TP.

### **4 Conclusion**

To conclude, this paper examined arguments provided in the literature, mainly by Pancheva (2005; 2008), in favor of head finality in Slavic on the basis of diachronic changes in the placement of clitics in the history of Bulgarian as well as the syntax of participles and the position of negation in Old Church Slavonic. It has shown that there is little evidence in support of head finality in Old Slavic, and that this claim is also challenged by empirical facts concerning the distribution of the auxiliary 'be' in the history of Bulgarian. Furthermore, the diagnostics used in favor of the head-final analysis have been demonstrated to give wrong predictions when applied to the same patterns found in Modern Slavic.

<sup>7</sup>According to an anonymous reviewer, another factor that favors a categorial distinction between the copula and the auxiliary is the different timing of their loss in East Slavic languages such as Russian.


### **Abbreviations**


### **Acknowledgements**

I wish to thank Željko Bošković, Hakyung Jung, Tanja Milićev, the FDSL-12 and FASL-25 audiences, and two anonymous reviewers for very helpful comments and discussion. All errors are mine.

### **References**



Broekhuis, Hans & Krzysztof Migdalski. 2003. Participle fronting in Bulgarian. In Paula Fikkert & Leonie Cornips (eds.), *Linguistics in the Netherlands 2003*, 1–12. Amsterdam/Philadelphia: John Benjamins. DOI:10.1075/avt.20.04bro

Chomsky, Noam. 1995. *The minimalist program*. Cambridge, MA: MIT Press.




### **Chapter 12**

## **Perception of Bosnian/Croatian/Serbian sibilants: Heritage U.S. vs. homeland speakers. A pilot study**

Kristina Mihajlović University of Arizona

Małgorzata Ćavar

Indiana University

Many dialectal varieties of Bosnian/Croatian/Serbian (BCS) show some level of merger of standard BCS alveolo-palatal and hard post-alveolar affricate series. This paper reports the results of a pilot study of the perception of BCS sibilants by heritage speakers in the United States. Twenty speakers were given a forced identification task. Results indicate that second generation heritage speakers are worse in performance than first generation heritage speakers. Additionally, heritage Croatian and Bosnian speakers across generations perform worse than heritage Serbian speakers.

**Keywords:** heritage language, phonology, language change, merger, affricates, Bosnian/Serbian/Croatian

### **1 Introduction**

In Bosnian/Croatian/Serbian (BCS), the standard varieties have a typologically relatively rare contrast between "hard" post-alveolar affricates (/tṣ, dẓ/, compare the transcription of Polish hard post-alveolars in Ladefoged & Disner (2012), or Slavic transcription /tš, dž/) and alveolo-palatal affricates (IPA transcription /tɕ, dʑ/).<sup>1, 2</sup> BCS also has hard post-alveolar fricatives but no parallel alveolo-palatal fricatives. The inventory of sibilants in standard BCS is represented in Table 1.

Kristina Mihajlović & Małgorzata Ćavar. 2018. Perception of Bosnian/Croatian/Serbian sibilants: Heritage U.S. vs. homeland speakers. A pilot study. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 265–288. Berlin: Language Science Press. DOI:10.5281/zenodo.2545529


Table 1: Sibilants of standard varieties of Bosnian/Croatian/Serbian (phonetic symbols with Latin orthography between slashes)

The contrast is cross-linguistically rare and, in fact, many dialectal varieties of BCS show different levels of merger of the two posterior places of articulation. The merging areas included Istria, northern Dalmatia, Dubrovnik, Boka Kotorska, the varieties spoken by the Muslims in Bosnia and Hercegovina, Catholics of eastern and northern Slavonia, areas of Banat and Timok-Lužnik, and the dialect spoken in the capital of Croatia, Zagreb (Stankiewicz 1986: 107, and references therein, Ivić 1958: 296, Żygis 2003 after Ivić 1958). We say "included" because, given the changed geopolitical situation after the Yugoslav war, the expansion of dialects at the cost of Čakavian, and the increased prestige of Kajkavian (spoken in the area of Zagreb), the situation described in the aforementioned publications has evolved substantially. However, the authors of the current publication are not aware of any new comprehensive study of the current distribution of merging varieties. Other dialectal areas of BCS, such as Western Hercegovina, resist the merger entirely. Due to the mobility of speakers between different dialectal areas, the current sociolinguistic situation is that of daily interaction between speakers realizing the contrast in different ways and those not realizing the contrast. This is also the situation for the heritage language as spoken in the U.S., which we investigate in this study. The influence of English can potentially further facilitate the merger because one posterior place of articulation in English (palatoalveolar /t͡ʃ, d͡ʒ/) corresponds to a two-way place of articulation contrast in BCS (alveolo-palatal /tɕ, dʑ/ and post-alveolar /tš, dž/). The current study contributes to a better understanding of the merger processes and language change in general but also informs us of the strategies that native speakers adopt, first, when in close contact with different dialects and, second, with a dominant foreign language with conflicting phonological patterns.

<sup>1</sup>The hard post-alveolars are notoriously ambiguous, not only because of the variation in their realization in BCS. If there is no merger, phonetically they are neither sensu stricto retroflexes (though compare phonological arguments in Hamann 2003, e.g. for Polish) nor typical palatoalveolars, although the sounds that have undergone merger are probably to be described as palatoalveolars. We identify the non-merged, hard post-alveolars as /tṣ, dẓ/, symbols used in Ladefoged & Disner (2012: 169) to describe Polish hard post-alveolars, where the authors made a strong case for using a non-IPA symbol distinct from the available symbols.

<sup>2</sup>We will continue using the Slavic symbols for the post-alveolar series for typographic reasons.

§2 situates our study in the broader context of heritage language research. In §3, the homeland situation is sketched. §4 is devoted to the methodology of the research and §5 provides the results. §6 relates the results back to the hypotheses and discusses the overall significance of the study, while the paper is concluded in §7.

### **2 Heritage language studies context**

The variety of BCS our study focuses on is the language of the Bosnian, Croatian, and Serbian diaspora in the United States (in particular, Chicago and the surrounding areas, including Northern Indiana). Previous research on heritage languages suggests that the language of heritage communities differs in systematic ways from the language as spoken in the homeland communities. One factor here is the interaction between the L1 and L2 (in our case, the L2 is English) sound systems (Polinsky 2018 and references therein). Additionally, in heritage language communities, the close contact of different dialects must be considered. Due to several factors, such as heritage speakers' geographical separation from homeland speech, there is less pressure from the standard language(s). This allows natural language change to proceed unimpeded (or impeded to a much lesser degree) by prescriptive grammars, formal schooling and official media. Finally, it is often the case that heritage speakers do not reach the level of competence in their heritage language that would be comparable with competence of homeland speakers (Scontras et al. 2015). The language of the homeland is often not sufficient to serve as a universal communication tool in a different social context of the new country. Compensation strategies are developed to accommodate the needs of speakers, for example, code-switching. We believe that the situation of increased variation facilitates and accelerates language change.

In heritage language studies, there is no consensus concerning the definition of a heritage language and heritage speaker. Polinsky (2018) and Kelleher (2010) provide overviews of several definitions of heritage speakers and heritage language. Scontras et al. (2015: 1) describe heritage speakers as "unbalanced bilinguals, simultaneous or sequential, who shifted early in childhood from one language (their heritage language) to their dominant language (the language of their speech community)." The definition of a heritage language can include a language that is acquired naturally at home, yet may be treated as a minority language outside of the home (Polinsky 2018; see Kupisch et al. 2014 for further discussion of the definition of a heritage speaker). On the other hand, the term "heritage speaker" may also be used exclusively to describe bilinguals who were raised in a monolingual home (e.g., Polinsky & Kagan 2007).

The current study will operate with the definition of a homeland speaker as an individual who natively speaks language A and still resides in their homeland country in which language A is widely spoken. On the other hand, we define a heritage speaker as someone who has lived in a new country in which their native language A is not widely spoken for a substantial amount of time, with the focus of their life interest moved to the new country;<sup>3</sup> if they do speak their heritage language, it is usually reduced to being spoken in the home or with other speakers in a close-knit community. Heritage speakers are usually immigrants or displaced persons who have moved to a new country and their heritage language skills often range widely, from fluent monolingualism or bilingualism to total language loss through attrition (Scontras et al. 2015). This study includes speakers from each end of this spectrum as well as many who fall somewhere along the middle with respect to language attainment.

Another issue in heritage studies is the observed diachronic generation-to-generation change (Otheguy et al. 2007). In our study, we have identified Generation 1 as speakers who acquired their heritage language in the homeland natively, and Generation 2 as speakers who have primarily learned their heritage language in the new country. We also included speakers who immigrated to the U.S. as very young children before the onset of formal education in the Generation 2 category.

Further, Polinsky (2018) maintains that cross-linguistically, "immigrant and heritage varieties, separated from the ongoing change in the homeland, tend to retain features that are (or are perceived as) conservative" (Polinsky 2018: 129). Whether or not the language changes within the same generation is also an area of interest. In our study we ask if the language change is a function of the time the speaker has spent within the new country. For example, in a large-scale analysis conducted using U.S. Census data of Spanish speakers living in the United States, Veltman (2000) found that "longer residence in the United States is associated with greater language shift" (Veltman 2000: 66). In the context of our study, we will readdress these questions, the cross-generation differences and linguistic conservatism, in the discussion section.

<sup>3</sup> For the purpose of our study, we adopted an arbitrary 7-year residence condition in the U.S. for the speakers of Generation 1 to exclude those new immigrants who would not have substantial exposure to English and the heritage language in the new country community.

Addressing the issue of conflicting phonological systems, Polinsky (2018) assumes that heritage speakers generally use "the knowledge of contrasts from one language to the other only when such contrasts are useful," and suppress such differences when unnecessary (Polinsky 2018: 115). Using the concept of a distinction being "useful", we are forced to adopt a perspective of a unidirectional influence of the dominant language on the heritage language. In the context of the current study, this means that we would expect heritage BCS speakers with English exposure to suppress their categorical distinction of alveolo-palatal vs. post-alveolar sibilants because English lacks this place of articulation distinction and maintaining this contrast is not "useful" from the point of view of the acquisition of English. One needs to observe that, although "usefulness" is primarily described in relation to the dominant language of the new country, in our case the "usefulness" of the contrast in the heritage language itself, that is, its functional load, can also play a role. The contrast between post-alveolars and alveolo-palatals has a low functional load in BCS.

### **3 Sibilants of homeland speakers**

BCS is a language continuum (called "macrolanguage" in Ethnologue) spoken in Croatia, Bosnia and Herzegovina, Serbia, and Montenegro with an estimated 15,260,000 speakers worldwide with considerable diaspora populations in the United States, Canada, and Australia (Simons & Fennig 2017). Figure 1 shows the geographical area where the homeland BCS speakers are located in Eastern Europe.<sup>4</sup>

As shown in Table 1, standard varieties of BCS contrast two posterior places of articulation for affricates. Conservative alveolo-palatals of BCS are articulated with an extreme raising of the tongue in the prepalatal area, while post-alveolars have the point of maximum constriction in the area of the alveolar ridge and just behind it, with the tongue body displaying no raising behind the constriction, see Figure 2. This is different from English, where the palatoalveolars are articulated with a raised convex tongue but the raising is considerably less pronounced than in BCS alveolo-palatal sounds.

<sup>4</sup> Figure 1 is based on https://upload.wikimedia.org/wikipedia/commons/3/31/Balkan\_Peninsula.svg by https://commons.wikimedia.org/wiki/User:SilentResident CC-BY-SA.


Figure 1: Southeastern Europe: Areas where homeland Bosnian/Croatian/Serbian is spoken

Figure 2: Alveolo-palatals (left) and post-alveolars (right) in a non-merging variety (here: Serbian; adapted from Miletić 1958)

Škarić (2009), describing standard Croatian, distinguishes between three pronunciation types. First, the classical, virtually non-existent pronunciation type preserves a clear contrast. Second, the received pronunciation characterizes the careful speech of educated speakers with remnants of the contrast, in particular, COG values partly overlapping for the two places of articulation.<sup>5</sup> Finally, in the generally accepted pronunciation, the places of articulation are completely merged. One has to bear in mind that the measures presented in Škarić (2009) represent the speech of speakers recorded in Zagreb, thus in the merging dialectal area. The rendition of the standard may be heavily influenced by the local dialect and, thus, the picture might differ if the measurements were to include speakers in other major Croatian cities. Whatever the details are, the point we want to stress is that some varieties of the homeland speech have merger or merger in progress, and that there is a lot of variation in the realization of the two posterior places of articulation for speakers of BCS in Europe.

<sup>5</sup>COG (Center of Gravity) is in acoustics a measure for how high the frequencies in a spectrum are on average (at a particular point of time). COG provides a convenient dimension of comparison for sounds with a noise component. For example, denti-alveolar fricatives have their COG in higher frequencies than posterior fricatives.
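The COG measure described in footnote 5 can be illustrated with a small sketch. It assumes one common formulation, in which COG is the power-weighted mean frequency of a spectrum; the exact weighting and settings behind the values reported in the literature are not stated here, and the function name `center_of_gravity` as well as the toy spectra are invented for illustration and are not measurements.

```python
def center_of_gravity(freqs_hz, power):
    """Power-weighted mean frequency of a spectrum, in Hz.

    Assumption: each frequency bin is weighted by its power; other
    weightings are possible and would give somewhat different values.
    """
    if len(freqs_hz) != len(power):
        raise ValueError("freqs_hz and power must have the same length")
    total_power = sum(power)
    if total_power <= 0:
        raise ValueError("spectrum has no energy")
    return sum(f * p for f, p in zip(freqs_hz, power)) / total_power


# Invented toy spectra: energy concentrated in higher vs. lower bins.
freqs = [2000, 4000, 6000, 8000]
anterior_like = [1, 2, 3, 4]   # more energy at high frequencies
posterior_like = [4, 3, 2, 1]  # more energy at low frequencies

print(center_of_gravity(freqs, anterior_like))   # 6000.0
print(center_of_gravity(freqs, posterior_like))  # 4000.0
```

A spectrum with more energy in the higher bins yields a higher COG, which is the sense in which two places of articulation with partly overlapping COG values are only partly distinct on this dimension.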

In our understanding of the "homeland" perception of the contrast between the two posterior places of articulation, we rely heavily on an earlier study by Ćavar & Hamann (2011), who tested speakers from Croatia and Bosnia and Hercegovina (to the exclusion of speakers from Serbia). The study was, however, guided by different research questions from those that we investigate today and used different software and tokens recorded by a different speaker. In particular, Ćavar & Hamann (2011) was a study of inter-language perception and used tokens produced by a native speaker of Polish. On the other hand, Polish alveolo-palatal affricates are very similar, if not identical, to the Croatian ones as articulated in non-merging dialects of Hercegovina. The tokens also included alveolo-palatal fricatives, which are absent from Croatian, to test if Croatian speakers can use their ability to discriminate between alveolo-palatal affricates and hard post-alveolar affricates to perceive the contrast between Polish alveolo-palatal fricatives and post-alveolar fricatives. Unlike the German control group, Croatian participants perceived all contrasts, including those absent from BCS, at a level comparable with Polish participants; see Figure 3. Four subjects out of twenty reached 100% accuracy in perception. The project also contained a production component.<sup>6</sup> The same Croatian participants were recorded reading word lists containing Croatian sibilants. The highest ratio of mistakes in perception (up to 12%) was observed in subjects who do not produce a stable contrast between soft and hard affricate series in Croatian. Both subjects with the relatively highest error level come from the same area (of Zadar). Interestingly, the relation is not reciprocal: not all participants who do not produce a consistent contrast have a higher ratio of mistakes in perception.

Figure 3: Perception of alveolo-palatals versus hard post-alveolars in a forced-choice identification task (Ćavar & Hamann 2008)

The rendition of the posterior place contrast showed a lot of inter- and intraspeaker variation. One speaker was switching between a more standard pronunciation and the dialectal pronunciation from Pag, where standard alveolo-palatals are realized as palatal stops. Out of twenty participants, ten articulated a stable contrast, for five participants the contrast was not realized in a reliable fashion, and five others did not have the contrast. Table 2 shows the geographical origins of speakers.<sup>7</sup>

Table 2: Homeland Croatian speakers by dialectal area


For those speakers whose pronunciation was not standard-like with two completely distinct categories, the impressionistic picture is as follows: five participants produced "hard" post-alveolars as softer, four had alveolo-palatals shifted towards the "harder" series, and three varied in their production depending on whether the following vowel was front or back (only the contexts of [e] and [a] were recorded). While only impressionistic descriptions are available at this point for all the original participants of the production study, it is clear that in homeland Croatian the perception of the contrast is surprisingly good given the high ratio of speakers with either a complete merger or inconsistent rendering.

The contrast between alveolo-palatal and another post-alveolar place of articulation is relatively rare cross-linguistically (Maddieson 1984). The two sound series contrast, for example, in Chinese languages, in Polish, Serbo-Croatian and Lower Sorbian, and in Ubykh and Abkhaz (cf. Ladefoged & Maddieson 1996). Some languages exhibit the contrast limited to an individual manner of articulation and phonation type and/or enhance it with additional secondary articulations. From the functional point of view, it might be beneficial to merge the two series. Alveolo-palatals involve a higher level of raising from the neutral position than palato-alveolars (e.g. in English). In contrast, "hard" post-alveolars require a much flatter tongue position than palato-alveolars. Thus, a high degree of articulatory precision is necessary to maintain the contrast. Further, the two series are relatively similar auditorily. For example, in the inter-language forced-choice identification task by Ćavar & Hamann (2011), untrained German native speakers achieved only 55% accuracy.

<sup>6</sup>This was a study conducted by Ćavar & Hamann (2008).

<sup>7</sup>Another useful consideration is how many of the speakers of either generation were speakers of a border variety of BCS – that is, a variety spoken near a political border. We did not collect this type of information and therefore cannot comment on this.

Additionally, in BCS languages the contrast has a relatively low functional load. To our knowledge, there are no studies focusing on the functional load of the contrast, but homeland grammars (e.g. Brozović 1991) usually cite only a couple of relevant minimal pairs.<sup>8</sup> A character counter for Croatian lists the letters *đ, ć, č, (d)ž* among the least frequent – not counting the foreign letters *y, w, x*, and *f* – with the following percentages: *đ* at 0.20%, *ć* at 0.49%, *č* at 0.92%, *ž* at 0.47%, all this bearing in mind that *ć* occurs in some frequent function words like *hoću* 'will.1sg'.<sup>9</sup> While the frequency of the letters in a written corpus cannot be interpreted directly to evaluate the frequency of sounds (for example, because *ž* is used in the representation of both the fricative [ž] and the affricate [dž]), it indicates that the sounds represented by the letters are at the bottom rank with regard to their functional load. No striking differences are expected between Croatian and other standard varieties. It is our understanding that a low functional load might potentially facilitate the merger (cf. Wedel et al. 2013).
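As a rough illustration of how letter percentages of the kind cited above can be computed, consider the following minimal Python sketch. It is not the character counter used in the study; the function name `letter_percentages` and the sample string are hypothetical, and the digraph *dž* is not treated as a unit, which mirrors the caveat about *ž* noted above.

```python
import unicodedata
from collections import Counter


def letter_percentages(text: str) -> dict[str, float]:
    """Return the percentage share of each letter among all letters in text."""
    # Normalize so that letters such as "ć" are single code points.
    normalized = unicodedata.normalize("NFC", text.lower())
    letters = [ch for ch in normalized if ch.isalpha()]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    return {letter: 100 * n / total for letter, n in counts.items()}


# Hypothetical sample string, for illustration only (not corpus data).
shares = letter_percentages("hoću kuću")
print(shares["ć"])
```

On a real corpus, the same function would be applied to the full text; the percentages it returns are shares of letters, not of sounds.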

### **4 Methodology**

The study was guided by a number of research questions. First, given the intra-language structural pressure and the influence from the contact language, we ask whether Generation 2 merges more than Generation 1.

H1: Second-generation speakers merge more than first-generation speakers.

<sup>8</sup>E.g. *spavaćica* 'pajamas' vs. *spavačica* 'woman, sleeping'. Other minimal pairs include *posećen* 'visited' vs. *posečen* 'cut', *veće* 'bigger' vs. *veče* 'evening' (Serbian), *kuće* 'houses' vs. *kuče* 'puppy' (Bosnian), and *ćar* 'benefit', 'gain' (Bosnian) vs. *čar* 'charm'; data retrieved from https://forum.unilang.org/viewtopic.php?t=3028.

<sup>9</sup>We used the following character counter: https://www.sttmedia.com/characterfrequencycroatian.


Further, we are also interested in the influence of the dominant language, English, on the speech of Generation 1 heritage speakers. In particular, we assumed that for Generation 1 there would be a correlation between the length of stay in the U.S. and the amount of merger:

H2: First-generation speakers are more likely to perceptually merge the two categories the longer they have spent in the United States.

Finally, we want to verify earlier findings from a pilot study by Ćavar et al. (2016) that hard post-alveolars tend to be realized as (more) soft while alveolo-palatals are relatively stable in the merger; see (1a). The other possible scenario in this merger process would be that both categories move towards the halfway point between alveolo-palatals and original post-alveolars (e.g. in terms of COG) to produce a palato-alveolar sound, as in (1b).

(1)
	- a. alveolo-palatals → alveolo-palatal; hard post-alveolars → alveolo-palatal
	- b. alveolo-palatals → palato-alveolars; hard post-alveolars → palato-alveolars

The study included a production part (a reading task) and a perceptual experiment. This paper reports the preliminary results of the perceptual study.

In the perception study, participants listened to syllables containing one of six sibilant sounds in various vowel contexts (VC, CV, or VCV) and were then forced to identify which sound of a pair they perceived it to be. Because the experiment used a forced-choice identification task, participants were required to make a decision on their perception of the sound regardless of how confident they felt about it. During the experiment, the reaction time was also recorded, but it has so far not been analyzed.

The tokens were recorded by a female native speaker from Hercegovina who produces a stable contrast between alveolo-palatal and post-alveolar places of articulation in her speech. The native speaker read a list of nonce words composed of a sibilant sound surrounded by the same vowel on either side (e.g. /eće/, /eče/, /eše/, etc.). The data were obtained through a forced-choice identification task built with Paradigm (experiment adapted from Lee & Jongman 2016) and run on a Lenovo laptop with headphones. Table 3 details the methods used to derive the 36 stimuli for the experiment; 6 stimuli came from each of the 6 sounds in the first column. Each stimulus was repeated once for a total of 72 tokens.<sup>10</sup> The final column gives the spliced stimuli which were used in the perception experiment.


Table 3: Stimuli used in the perception experiment

Tokens were presented using headphones connected to a computer. Participants were tasked with indicating what they hear by clicking either the left arrow for /ć/, /đ/ or /c/, or the right arrow for /č/, /dž/ or /š/, as shown in Figure 4.
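The design described above (6 sounds, 6 stimuli per sound, each presented twice, with a left/right arrow response) can be sketched in Python as follows. This is an illustrative sketch only, not the Paradigm script used in the study: the stimulus indices stand in for the actual stimuli of Table 3, and the shuffled presentation order and all function names are assumptions.

```python
import random

# Response keys as described in the text: left arrow for /ć/, /đ/, /c/;
# right arrow for /č/, /dž/, /š/.
LEFT = {"ć", "đ", "c"}
RIGHT = {"č", "dž", "š"}
SOUNDS = sorted(LEFT | RIGHT)


def build_trials(stimuli_per_sound=6, presentations=2, seed=0):
    """Build a list of (sound, stimulus_index) trials.

    The indices are placeholders for the real stimuli; shuffling the
    order is an assumption, not something stated in the text.
    """
    trials = [
        (sound, index)
        for sound in SOUNDS
        for index in range(stimuli_per_sound)
    ] * presentations
    random.Random(seed).shuffle(trials)
    return trials


def expected_key(sound):
    """Return the arrow key that counts as a correct response."""
    return "left" if sound in LEFT else "right"


def accuracy(trials, responses):
    """Share of trials on which the pressed key matches the expected key."""
    correct = sum(
        expected_key(sound) == key
        for (sound, _), key in zip(trials, responses)
    )
    return correct / len(trials)


trials = build_trials()
print(len(trials))  # 6 sounds x 6 stimuli x 2 presentations = 72
```

A participant who pressed the same key on every trial would score 50% under this scheme, which is the chance baseline for a two-way forced choice.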

<sup>10</sup>/c/ and /š/ were included in the perception experiment to act as a control set to contrast with the merging series. The stimuli used for /c/ came from a different native speaker's recording, and the sound was extracted independently with no surrounding vowel information. The stimuli for /š/ came from the original native speaker, but fewer environments were included. As expected, the perception of both sounds was perfect for all speakers across all dialects and all generations (see Table 6) because neither /c/ nor /š/ is merging in any variety of BCS we are aware of.

Figure 4: Perception experiment design

The symbols correspond to the standard orthographic form in the Latinica (Latin-based) alphabet. While the post-alveolar sounds in the merging series (/č, dž/) were grouped together and the alveolo-palatal sounds (/ć, đ/) were grouped together, there was no other correspondence between the direction of the arrow and any phonetic characteristic of the sound, especially in the arbitrary case of /c/ and /š/. Participants were asked to complete a brief demographic questionnaire in which they provided information, for example, on the language background of their parents. The language background survey asked participants to disclose their



8. other homes ("other cities which you have visited/lived for more than three consecutive months"), including location (city, state, country), duration, and age.

Then for each language/dialect known (including native), participants were asked to provide


A complete copy of the questionnaire is available upon request from the first author.

We defined first generation (Gen1) speakers (*N* = 11) as those who moved to the United States at seven years of age or older because children start their formal education in the homeland at the age of 7 and from that time on they become increasingly exposed to normative language education. Generation 2, in contrast, includes speakers who emigrated as very young children (before formal schooling began), as well as those who learned their heritage language in the current country of residence from their Generation 1 parents. In our study, we have investigated only Generation 1 and Generation 2 speakers.
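The grouping criterion can be stated compactly as a small Python function. This is a sketch of the definition as given in the text; the function name and the use of `None` for speakers who learned the heritage language in the current country of residence are hypothetical encodings, not part of the study's materials.

```python
from typing import Optional

# Start of formal education in the homeland, per the definition in the text.
SCHOOL_AGE = 7


def heritage_generation(age_at_arrival: Optional[int]) -> int:
    """Classify a speaker as Generation 1 or Generation 2.

    age_at_arrival is the age at which the speaker moved to the U.S.;
    None (a hypothetical encoding) marks speakers who did not immigrate
    but learned the heritage language from Generation 1 parents.
    """
    if age_at_arrival is not None and age_at_arrival >= SCHOOL_AGE:
        return 1
    return 2
```

Under this definition, a speaker who arrived at age 7 counts as Generation 1, while one who arrived at age 6 counts as Generation 2.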

Twenty speakers participated in the study, most of them (*N* = 18) women. The age of participants ranged from 20 to 78 and was correlated with their generation: Generation 1 was older and Generation 2 was younger. Generation 1 comprised 11 participants and Generation 2 comprised 9 participants. We divided the participant group into Bosnians, Croatians, and Serbians, based on the place of birth, or the place of birth of the parents (for Generation 2). Half of all participants (*N* = 10) were included in the Serbian group, six in the Bosnian and Herzegovinian group, and four in the Croatian group. Table 4 provides a summary of the speaker breakdown.<sup>11</sup>

<sup>11</sup>Gen 1 age mean: 45.73 (SD 18.91); Gen 2 age mean: 25.00 (SD 11.36).



| | Generation 1 | Generation 2 |
|---|---|---|
| dialect breakdown | 2 B, 3 C, 6 S | 4 B, 1 C, 4 S |

Table 4: Participants overview; B = Bosnian, C = Croatian, S = Serbian

### **5 Results**

The following sections present the findings. §5.1 presents the basic descriptive statistics, and §5.2 compares the results for Generation 1 and Generation 2 speakers. §5.3 addresses the length of stay in the U.S. as a potential factor influencing the merger in the first generation of immigrants, and §5.4 looks at the potential differences in the perception of different categories. The comparison of the current results with the available results for homeland speakers in Ćavar & Hamann (2011) is taken up in §6.

### **5.1 General results**

Let us start with the ratios of incorrect responses for the three sub-groups of participants depending on the area of origin across the two generations. The data are further divided with respect to the type of sound – non-merging hard post-alveolars (/š/), merging hard post-alveolars (/č, dž/), and merging alveolo-palatals (/ć, đ/).

Due to the complex geolinguistic situation in the countries of former Yugoslavia, it was sometimes impossible to unambiguously identify the exact dialect spoken by participants in Generation 1; instead, we use the area of origin as the variable. A generalized linear mixed model (GLMM) analysis was conducted, and the area of origin did not turn out to be a significant predictor in our data, with *p* = 0.24 (the model also includes generation, sound type, sound environment, age, and number of years since the acquisition of English). Croatian-origin participants' responses are marginally different from those of the Serbian group (*p* = 0.099). However, when looking at raw percentages, certain tendencies in the data can be observed. In particular, in the first generation, Croatian-origin participants merge more than other groups. This difference is levelled in the second generation.


Table 5: Percent of incorrect answers by area of origin, generation and sound type

### **5.2 Do the second-generation speakers merge more than the first-generation speakers?**

The absolute number of incorrect responses is much higher in Gen2 than in Gen1 in total as well as for each subgroup of participants separately. A GLMM analysis was conducted with generation as a factor. The difference between Gen1 and Gen2 is statistically significant (*p* = 0.005). For Serbian-origin participants, the difference between Gen1 and Gen2 is significant (*p* = 0.046), and it is highly significant for Bosnian speakers (*p* = 0.001); however, the difference turned out not to be significant for Croatian-origin participants (*p* = 0.739).

### **5.3 Does the duration of the stay in the United States influence the level of merger in Gen1 speakers?**

H2 predicted that the longer a speaker has stayed in the United States, the more likely the speaker is to merge the categories. Our results do not support this hypothesis. In fact, the duration of stay in the U.S. in Generation 1 speakers had no effect on the level of merger (*p* = 1). Figure 5, which was added in response to a reviewer's request, shows this lack of correlation between length of stay in the United States and accuracy on merging tokens. For our purposes, a higher level of merger is indicated by a lower accuracy in identifying merging tokens.

Figure 5: Length of stay in the U.S. vs. accuracy on merging tokens

### **5.4 Is either of the merging categories more "difficult"?**

Further, we have looked at the identification ratios across sound categories. Post-alveolars are split into the affricates (potentially merging with the alveolo-palatal series) and fricatives (which do not have corresponding alveolo-palatal fricatives and thus do not merge with any other existing category). While the identification of [š] (a non-merging category) is, as expected, close to 100%, for all heritage speakers – Generation 1 and 2 – the correct identification of voiceless affricate categories is only slightly above 80%.

Figure 6 shows the perception accuracy by sound type for voiceless sounds. In Figure 7, the sound type accuracy for each of the three sound types, alveolo-palatal merging, post-alveolar non-merging, and post-alveolar merging, is represented for the two generations separately. The dashed line represents the perception accuracy of Generation 1, while the solid line represents the accuracy of Generation 2. The accuracy of the post-alveolar [š] is near 100%, which was expected as this sound is non-merging.

Figure 6: Perception in heritage speakers: alveolo-palatal (in the above graph, represented by 'pre-palatal') merging = [ć], post-alveolar non-merging = [š], post-alveolar merging = [č] (all heritage participants)

Figure 7: Accuracy of sound types. Key: dashed = Gen 1, solid = Gen 2; Alveolo-palatal (in the above graph, represented by 'pre-palatal') merging = [ć/đ], post-alveolar non-merging = [š], post-alveolar merging = [č/dž]

Contrary to what was expected, post-alveolar affricates and alveolo-palatal affricates reached a similar level of accuracy in the identification task: the difference in the identification of the two merging categories was not significant, neither for Generation 1 nor for Generation 2. /š/ is correctly identified more often than both alveolo-palatal /ć/ and post-alveolar /č/ for both generations, and the difference is statistically significant; see Table 6.

Table 6: Statistically significant differences in perception between sound types for voiceless sounds (OR = odds ratio, CI = confidence interval)


The perception of alveolo-palatals is not statistically better or worse than the perception of post-alveolar affricates. This finding does not support the hypothesis that alveolo-palatals are more stable than post-alveolar merging affricates, which was based on the pilot data from an articulatory study by Ćavar et al. (2016). Non-merging categories (posterior fricatives) stand apart from the rest in terms of accuracy of identification. Non-merging post-alveolars are approx. 7.49 times more likely to be identified correctly than merging post-alveolars and approx. 10.96 times more likely to be identified correctly than alveolo-palatals.
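Since Table 6 is stated in terms of odds ratios, it may help to recall how an odds ratio relates to proportions of correct responses. The following Python sketch computes a raw odds ratio from two accuracy values; the accuracies used are made-up illustrative numbers, not the study's data, and odds ratios estimated by a mixed model will generally differ from this raw computation.

```python
def odds(p: float) -> float:
    """Convert a proportion of correct responses into odds."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1 - p)


def odds_ratio(p_a: float, p_b: float) -> float:
    """Odds of a correct response in condition A relative to condition B."""
    return odds(p_a) / odds(p_b)


# Hypothetical accuracies, for illustration only (not the study's data).
non_merging = 0.98
merging = 0.80
print(round(odds_ratio(non_merging, merging), 2))
```

Note that an odds ratio compares odds rather than probabilities, so a large odds ratio can correspond to a modest difference in percent correct when both accuracies are high.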

### **6 Discussion**

Hypothesis 1, which states that Generation 2 speakers would merge more than Generation 1 speakers, is supported by our perception data. This result is not surprising. Merging is expected, first, because of intra-language tendencies, given that the contrast is typologically relatively uncommon, and second, because of the potential pressure from English, which has only one place of articulation in the posterior area instead of two.


Since length of stay in the United States is not correlated with accuracy, hypothesis H2 is not supported by our findings. This is contrary to Veltman's finding, from a large-scale study of Spanish data based on the U.S. Census, that longer residence in the United States is linked with greater language shift (Veltman 2000: 66). We believe that the tendency to merge might be critically influenced by a number of factors apart from the duration of stay in the new country, e.g. the relatively high proportion of the use of BCS in Generation 1.

We encountered variation in the usage of BCS by generation of speakers. Generation 1 speakers, on average, report using BCS 42% of the time compared to only 20% for Generation 2 speakers. The more extensive usage in Generation 1 speakers may be a result of the relative language skills in the two languages. Generation 2 speakers, on the other hand, report using English 80% of the time compared to only 58% for Generation 1 speakers. We argue that this increased use of English, among other factors, also helps to account for the lower accuracy results in Generation 2 speakers versus Generation 1 speakers. Further, the contrast between alveolo-palatals and post-alveolars is hard-coded into BCS spelling; with any level of education in BCS, it is very prominent and not likely to be obliterated once it is there for a speaker. Generation 2 speakers, per our definition, did not have a chance to participate in formal education in Croatia, Bosnia, or Serbia and were either not exposed or exposed to a lesser degree to the prescriptive norm of the homeland language. Other factors contributing to the lower performance in Generation 2 might include the level of formality in the interaction with the dominant language and the education in the dominant language.

Lastly, with regard to hypothesis H3, i.e. our third research question of whether alveolo-palatals are more stable in the merger than hard post-alveolars, the difference in the "goodness" of perception between post-alveolars and alveolo-palatals is not statistically significant, which does not provide support for the hypothesis in Ćavar et al. (2016). On the other hand, given the statistically significant difference between merging and non-merging hard post-alveolars, our data provide strong evidence for the progress of merger.

Our results do not support the claim advocated by Polinsky (2018) that immigrant and heritage language varieties tend to retain conservative features, as was the case with early American English being more conservative than British English.<sup>12</sup> Perhaps the deciding factor is the fact that the process of merger had already started in the homeland speech and was "imported" to the U.S. Our results support the opposite claim – that language change is facilitated in the heritage language context. Our study does not, however, provide much evidence as to what the factors behind the language change may be: whether the accelerating factor is the influence of the dominant language, or whether unimpeded language-internal systemic pressures contribute to the faster rate of language change. A difference between the Croatian/Bosnian heritage speakers with a higher level of merger and, on the other hand, Serbian heritage speakers with a lower level of merger indicates that this is due to the homeland dialectal differences, and, further, that for the change to be accelerated in the heritage language, it has to be already well in progress in the homeland speech. A study of a heritage language with a similar contrast but no merger in progress in the homeland speech would provide some evidence to support either analysis.

<sup>12</sup>This situation of conservatism exists in several languages. See Kang & Nagy (2012) for a discussion of Seoul Korean homeland and heritage speakers exhibiting the expected pattern of conservatism in the aspirated/lenis distinction in stops. Additionally, Thepboriruk (2015) gives an account from heritage Thai in which heritage teen speakers were consistently more conservative than their parents with respect to voiceless aspirated stop affrication.

Polinsky (2018) remarks that features which are "useful" in the dominant language tend to be retained in the heritage language. Our study does not contradict this observation. The distinction between alveolo-palatal and post-alveolar affricates is not utilized in English, thus, heritage speakers can afford to abandon the contrast. This perspective, however, assumes a uni-directional impact of the dominant language on the heritage language, a claim which is problematic. As some studies indicate, e.g. Łyskawa (2016), the dominant language of Generation 2 heritage speakers is different from the language spoken by monolingual native speakers of the dominant language, that is, the heritage speech influences the dominant language. The issue deserves further investigation.

The most obvious question is whether heritage speakers merge more than homeland speakers, and the answer to this question is that we cannot be sure. A comparison of raw numbers indicates that the combined group of Generation 1 and Generation 2 heritage speakers merges more than the "homeland" speakers in Ćavar & Hamann's (2011) study on the perception of Croatian and Bosnian/Hercegovinian speakers. This is surprising to some extent because Ćavar & Hamann (2011) targeted the areas with the most strongly merging dialects, to the exclusion of areas with less merging. If we exclude the heritage speakers of Serbian origin, Bosnian/Croatian heritage participants in both Generation 1 and Generation 2 seem to perform worse than Bosnian/Croatian participants in Ćavar & Hamann (2011); see Table 7.<sup>13</sup>

Table 7: Accuracy of responses

One of the reviewers commented that the results of the two studies cannot be compared because of differences in methodology, primarily for two reasons: first, because the tokens used in the Ćavar & Hamann (2011) study included Polish sounds, and second, because the homeland study did not include participants from Serbia.<sup>14</sup> As for the former criticism, Ćavar & Hamann (2011) have demonstrated that Croatian speakers perceive a categorical difference between hard and soft categories exceptionally well, and this is also the gist of the current heritage study. The Polish contrast between alveolo-palatal affricates and post-alveolars is rendered in a strikingly similar if not identical way to the prototypical rendition of the Croatian contrast, that is, the one from the non-merging Hercegovina areas. This is to be expected. Phonological inventories with a comparable number of phonemes and comparable contrasts tend to be rendered phonetically in a similar way, as discussed, for example, in Boersma & Hamann (2008) and demonstrated by their simulation of the development of sibilant inventories. We admit that the strength of the effect might be attributed to the difference of focus and methodology between the current study and the homeland study. However, we are convinced that the numbers are indicative of a tendency, especially because they are also consistent with the comparison between heritage Generation 1 and Generation 2.

<sup>13</sup>The homeland speaker data in Table 7 are from Ćavar & Hamann (2011).

### **7 Conclusions**

This paper discussed the results from a perceptual study with 20 heritage speakers of Bosnian/Croatian/Serbian living in the United States. Our study has shown that heritage speakers have a high ratio of merger. The ratio of merger in the heritage speech is potentially higher than in that of the homeland population, but a direct comparison is impossible given the available set of data. A difference has been observed between Generation 1 speakers, with a lower ratio of merger, and Generation 2 speakers, with a higher ratio of merger. No direct correlation between the years of residence and the level of merger has been discovered for Generation 1. The findings of this pilot study contribute to the discussion surrounding heritage speakers and language change. Further research on the production of the merging sounds will shed light on the interaction between perception and production in bilingual heritage speakers. Finally, additional studies on the heritage speakers of other languages with a similar consonantal inventory will provide a commentary on the role of typological factors in the sibilant merger in heritage BCS in the U.S., as opposed to the role of English as the dominant language.

<sup>14</sup>As in the current study, Ćavar & Hamann used a forced identification task. They used the Praat experiment environment (Boersma & Weenink 2016) instead of Paradigm, and the tokens were fricatives and affricates of Polish. The arrangement of the responses on the screen was also the same, with "soft" consonant categories on one side and "hard" consonant responses on the other.

### **Acknowledgements**

We would like to thank the audience of FDSL-12 and the anonymous reviewers, including the reviewers of the first version of the paper, for their feedback. Additionally, Kelly Berkson was integral in discussing phonetic aspects of the study. We also thank Goun Lee for the Paradigm experimental template and Michael Frisby from the Indiana University Statistical Counseling Center for his assistance in data analysis. We are grateful to Christian DiCanio for providing the spectral moments script. We also value the time of all the speakers who participated.

### **References**




### **Chapter 13**

## **General-factual perfectives: On an asymmetry in aspect choice between western and eastern Slavic languages**

Olav Mueller-Reichau

University of Leipzig

The paper addresses the issue of microvariation within Slavic aspect. Specifically, it investigates perfective general-factuals, which appear in Czech and Polish but not in Russian. It is shown that perfective aspect is used in Czech and Polish when the semantics of the VP of the sentence is such that reference is limited to unique events, or when reference to a unique event is contextually determined. Assuming that semantic aspects operate over VP-meanings, it is then argued that the semantics of perfective aspect in Polish and Czech includes a completedness condition and a uniqueness condition whereas the semantics of the Russian perfective, more strongly, encodes target state validity. This difference categorically bans perfective aspect from general-factual contexts in Russian, but not in Czech and Polish.

**Keywords:** microvariation, perfective, general-factual, target state, uniqueness, VP

### **1 Introduction**

The present paper contributes to the discussion of microvariation within the realm of Slavic aspect. As is well-documented, the distribution of perfective and imperfective verb forms among contexts is not constant within the Slavic family (see, among others, Stunová 1991; 1993; Breu 2000; Petruchina 2000; Dickey 2000; 2015; 2018; Gehrke 2002; Wiemer 2008; Rivero & Arregui 2010; Alvestad 2013; Gattnar 2013; Berger 2013; Arregui et al. 2014; Dübbers 2015; Fortuin & Kamphuis 2015; 2018). Although there is typological reason to speak of "the Slavic-style aspect" (e.g. Dahl 1985; Plungjan 2011), it would be utterly wrong to consider the aspectual systems of the Slavic languages all the same.

Olav Mueller-Reichau. 2018. General-factual perfectives: On an asymmetry in aspect choice between western and eastern Slavic languages. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 289–311. Berlin: Language Science Press. DOI:10.5281/zenodo.2545531


The pioneering study on microvariation of aspect in Slavic is Dickey (2000). Based on disagreeing patterns of aspect choice (perfective / imperfective), Dickey analyzes the Slavic languages as clustering around two poles on a scale. The western languages represent one pole, the eastern languages the other one. Polish as well as Serbian and Croatian are diagnosed as occupying an intermediate region, as these languages share properties with languages of the western as well as with languages of the eastern group; see Table 1.

Table 1: Dickey (2000: 5)


Dickey (2015) presents a revision of the 2000 picture. The most important innovation is that the South Slavic languages (apart from Slovene) are no longer classified as members of the western or eastern groups, but are classified separately, see Table 2.

In this paper, I will be concerned with Czech, Polish and Russian. For the present purposes, therefore, the move from Dickey (2000) to Dickey (2015) is by and large irrelevant. What matters is that Czech is treated as a member of the western group, that Russian is counted as an instance of the eastern group, and that Polish is treated as a language sharing properties with both these groups.

Table 2: Dickey (2015: 36)

More specifically, I will look at the aspectual behavior of these three languages in general-factual usage. General-factual contexts are particularly interesting from a comparative point of view. The Russian-biased general wisdom is that general-factuals call for imperfective aspect. As has been observed by Dickey (2000), among others, however, there are certain general-factual contexts in which Czech speakers, for instance, resort to perfective forms. The aim of the present study is twofold. The first goal is to describe the kinds of contexts in which the western language Czech displays general-factual perfectives, whereas the eastern language Russian displays general-factual imperfectives. Since the theoretical prediction for "transitional" Polish is unclear, we will always have a look at the choice that speakers of Polish make in the respective cases. As we will see, and as noted in Dickey (2000: 101), with respect to aspect choice in general-factual contexts Polish is not "in between", but follows the Czech pattern. The second goal, in turn, is to explain the described differences by tracing them back to differences in the underlying semantics of perfectivity.

The paper is structured as follows: In §2 I will introduce the phenomenon of perfective general-factuals in Czech and Polish. In §3 I will discuss and reject the hypothesis (proposed by Dickey 2000) that these cases can be traced back to underlying achievement verbs. In §4 I will discuss and reject the hypothesis (suggested by Cummins 1987) that the decisive factor is lack of volition. In §5 I will discuss and reject the hypothesis (brought up by myself) that perfective general-factuals are explicable in terms of event uniqueness. In §6, however, I will argue that the uniqueness hypothesis is not entirely on the wrong track, showing that it will produce correct results if it is relativized to the syntactic domain of the VP. In §7 the situation in Russian will be taken into account. I will explain why general-factual contexts are per se incompatible with perfective aspect in Russian, and what this reveals about differences in the semantics of the respective aspectual categories in the western and eastern Slavic languages under consideration. §8 concludes the paper.

Olav Mueller-Reichau

### **2 General-factual perfectives**

Somewhere in the world wide web, a young Russian-speaking lady tells us ten facts about herself.<sup>1</sup> We are invited to read that she prefers to drink tea without sugar (fact 1), that she is 18 years old but feels like 16 (fact 8), that she once started piano lessons but soon quit in favor of choreography (fact 3), and so on. Of relevance for us is fact 6. The young woman is telling us that she has once fallen from a tree. The Russian sentence that she uses to express that is (1):<sup>2</sup>

(1) Ja padala s dereva.
    I fell.ipf from tree
    'I (once) fell from a tree.'

This is a canonical instance of a Russian general-factual imperfective. A similar one is sentence (2), which the young lady uses to convey fact 7:

(2) Na menja padal šifer.
    on me fell.ipf roof
    'I was (once) hit by a piece of roof.'

Russian general-factuals are characterized by reference to a single completed event only vaguely located in past time, with verbal morphology always being imperfective.<sup>3</sup> What is interesting is that, if our young lady were Czech-speaking, she would have used the perfective verb form to convey her message:

(3) Jako malá jsem spadla ze stromu.
    as small aux fell.pf from tree
    'As a child I (once) fell from a tree.'

What about Polish? Polish turns out to pattern like Czech:

(4) Jako dziecko spadłam z drzewa.
    as child fell.pf from tree
    'As a child I (once) fell from a tree.'

<sup>1</sup>https://ask.fm/Nailyuta

<sup>2</sup> I reduce grammatical information in the gloss to a relevant minimum. ipf is for imperfective, pf is for perfective aspect. Other abbreviations are explained at the end of the paper.

<sup>3</sup>Note that the definition of general-factuals used here does not cover cases of 'presuppositional' (Grønn 2004) / 'actional' (Padučeva 1996) / 'anaphoric' (Mehlig 2011) imperfectives. Note furthermore that I restrict the scope of the term to past tense contexts, which is debatable.


The following pair of examples contrasting Polish general-factual perfectives (5) and Russian general-factual imperfectives (6) is taken from Wiemer (2001):


The kind of data discussed so far are described in Dickey (2000: 95ff.). It is important not to overlook that in other cases of general-factuals, Czech and Polish resort to imperfective aspect, just like Russian does. The examples (7) to (9) may serve as illustration.


We saw that Czech and Polish form *perfective* general-factuals, but that they do not always do so. It is only for a subset of general-factuals that these languages deviate from the imperfective coding holding in Russian throughout. The question that arises is: what precisely characterizes the contexts in which speakers of Czech and Polish use perfective forms to denote completed past events only vaguely located in time?

### **3 Achievements?**

The first hypothesis to be discussed stems from Dickey (2000), reemphasized in Dickey (2018). According to Dickey, the use of imperfective aspect in the languages of the western group presupposes a temporal extension of the denoted event. Given this, speakers will have to resort to perfective aspect whenever the predicate of the sentence is based on an achievement verb: "In the west […] the impv forms of achievement verbs are unacceptable in contexts where one otherwise expects the impfv" (Dickey 2000: 124).

The idea may be restated in terms of the following hypothesis.

(10) **Hypothesis H1:** Perfective aspect is used in general-factuals whenever the verb is an achievement verb because achievement verbs do not supply the temporally extended events required by imperfective aspect in Czech and Polish.

This builds on Dickey's general conclusions about aspectual semantic differences between western and eastern languages. According to Dickey (2000: 107–109), the western imperfective expresses the notion of quantitative temporal indefiniteness, characterized as "the assignability of a situation to several points in time". The eastern imperfective, by contrast, expresses the notion of qualitative temporal indefiniteness, which is described as "the non-assignment of a situation to a unique location relative to other states of affairs".

Consider example (4), for instance. Here the predicate is formed on the basis of a lexical verb which is arguably analyzable as characterizing achievement events. Being an achievement, the verb does not supply "several points in time", which is, according to Dickey, a prerequisite for using the western imperfective. Therefore, in this case, the choice of imperfective aspect is no option for the speaker of Polish, and she has to use the perfective instead.

There is, however, counterevidence to Dickey's proposal. To see why, consider the following example from Russian first:

	- B: Da, esli ja ne ošibajus', odnaždy zamerzal.
	  yes if I not make.mistake once froze.over.ipf
	  'Yes, if I am not mistaken, it once froze over.'

As can be seen and as expected, Russian speakers use imperfective aspect here. Now, as can be seen in (12) and (13), speakers of Czech and speakers of Polish would use perfective aspect when expressing the same thing:

(12) A: A Niagarský vodopád někdy zamrzl?
        but N. waterfalls ever froze.over.pf
        'Did the Niagara Falls ever freeze over?'


	- B: Tak, jeśli się nie mylę, kiedyś zamarzł.
	  yes if refl not mislead once froze.over.pf
	  'Yes, if I am not mistaken, it once froze over.'

Above we saw that, according to Dickey's explanation of general-factual perfectives, the respective predicates are perfective because of a conflict between the meaning of the imperfective and the lexical meaning of the verb, and that the conflict arises with achievement verbs. Accordingly, the reason why (12) and (13) have perfective predicates should be that these predicates are formed from achievement verbs lacking a process component in their lexical-semantic structure. The problem is that, if (12) and (13) were based on verbs lacking such a component, we would not expect these verbs to be (easily) used for denoting ongoing processes. As a matter of fact, however, they may be used in that function, quietly and without fuss. Consider the Polish example in (14):

(14) Jezioro zamarza!
     lake freeze.over.ipf
     'The lake is freezing over.'

The sentence can be found on the internet, written above a photograph that shows a half-frozen lake. It is further elaborated by the following text:<sup>4</sup>

Po raz pierwszy tej zimy woda w Jeziorze Tarnobrzeskim zaczęła zamarzać dalej niż tylko kilkadziesiąt centymetrów od brzegu. ['This winter for the first time the water of Lake Tarnobrzeg froze further than for just some dozens of centimetres from the lakeside.']

Example (14) proves that the Polish predicate meaning 'freeze over' characterizes temporally extended events. Thus, it does supply "several points in time". According to Dickey's reasoning, this implies that the predicate should be lexically capable of taking on imperfective morphology. But then, why does it not show up in the imperfective in (13)?

<sup>4</sup>http://tarnobrzeskie.eu/2016/01/23/jezioro-zamarza-zdjecia/


One might, of course, object that the argument misses the point because Polish is not classified as a genuine western language within Dickey's system. Fair enough, but consider the Czech equivalent to (14):

(15) Jezero (právě) zamrzá.
     lake right.now freeze.over.ipf
     'The lake is freezing over (right now).'

There is an alternative way of understanding Dickey's proposal.<sup>5</sup> Maybe the claim is that the sentences (12) and (13) denote achievements *because* they are perfective. Following this suggestion, we should perhaps restate H1:

(16) **Hypothesis H1':** Perfective aspect is used in general-factuals whenever the speaker wants to refer to an achievement event because the use of the imperfective in Czech and Polish is restricted to reference to temporally extended events.

Yet the problem remains. Note that the situations referred to in (12) and (13) *are* temporally extended. As a matter of fact, the freezing over of a waterfall never happens all of a sudden. It is a very time-consuming process indeed. Given that "in the default conceptualization there is a process component in these situations" (Dickey 2018: 78), H1' predicts that the natural translation of the Russian (11) into Czech or Polish should make use of an imperfective verb form. What is actually chosen, however, is a perfective verb form. This raises the unanswered question: why should the speaker want to present the freezing of the Niagara Falls as an instantaneous event?

I think that it is fair to conclude that, without further modification, Dickey's solution to the puzzle of general-factual perfectives fails to explain cases like (12) and (13).<sup>6</sup>

### **4 Volition?**

The next idea to be discussed has been stated by Cummins (1987) as a generalization to account for the situation in Czech:<sup>7</sup>

<sup>5</sup>Thanks to an anonymous reviewer for pointing that out to me.

<sup>6</sup> Fortuin & Kamphuis (2015) raise a similar concern about Dickey's analysis of the western imperfective.

<sup>7</sup> In the quote, I have replaced Cummins' "constative I" by the synonymous "general-factual imperfective".


Czech absolutely prohibits the general-factual imperfective in all low-volitional predicates. This restriction admits no exception […]: all Czech general-factual imperfectives have predicates with high agentivity.

(Cummins 1987: 41)

For the sake of the argument let us suppose an intuitive understanding of volition, according to which it is "the cognitive process by which an individual decides on and commits to a particular course of action."<sup>8</sup> Given that, Cummins' law may suggest the following hypothesis.

(17) **Hypothesis H2:** Perfective aspect is used in general-factuals whenever the speaker wants to refer to a non-volitional event because (for some unclear reason) general-factual imperfectives in Czech and Polish are restricted to volitional actions.

This may, indeed, account for the cases that we have come across so far. Sentences like (3) report on accidental events, and accidents are by definition not accompanied by the individual's decision on the course of events. Sentences like (12) may also be accounted for, as the event participant is inanimate and, hence, void of volition.

Nevertheless, the approach as it stands is not tenable. This has been shown in Dickey (2000: 101–102). Consider the following examples:


These sentences clearly report on volitional actions, and yet the perfective form is used. If lack of volition were the explanation for the use of perfective aspect in general-factual contexts, as hypothesis H2 suggests, examples like Czech (18) and Polish (19) should not exist. Appealing as H2 may seem at first sight, we have to look for a better explanation.

<sup>8</sup>https://en.wikipedia.org/wiki/Volition\_(psychology)


### **5 Uniqueness?**

The third hypothesis that I would like to check may be stated as follows:

(20) **Hypothesis H3:** Perfective aspect is used in general-factuals whenever the speaker wants to refer to an event which is unique in the relevant context because perfectivity semantically expresses uniqueness in Czech and Polish.

To make sense of that, let us assume that the aspectual operators in Czech and Polish have the following semantics:<sup>9</sup>

$$\begin{aligned} \text{(21)} \quad & [\![\text{IPF}]\!] = \lambda P \lambda t \exists e [P(e) \land e \bigcirc t]\\ & [\![\text{PF}]\!] = \lambda P \lambda t \exists e [P(e) \land e \subseteq t \land \neg \exists e' [P(e') \land e' \neq e]] \end{aligned}$$

Informally speaking, the PF-operator includes a completedness requirement (*e* ⊆ *t*) as well as a uniqueness condition (¬∃*e*′[*P*(*e*′) ∧ *e*′ ≠ *e*]). The former requires that the denoted event must have reached its culmination point, and the latter requires that there is no possibility or, at least, no expectancy of a second event realization of the same type in the discourse context. The IPF-operator, by contrast, imposes only a very vague condition on interpretation (*e* ○ *t*). All that it requires is that the event time should, in one way or another, overlap the reference time (cf. Grønn 2004).
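
The two operators in (21) can be made concrete in a toy model. The following is a minimal sketch, not part of the original analysis: it assumes that events are represented by closed integer intervals standing in for their run times, that the VP property *P* is a finite list of such events, and that the overlap relation is plain interval overlap. The names `Event`, `ipf` and `pf` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Toy event: a label plus its run time as a closed interval."""
    label: str
    start: int
    end: int


def overlaps(e, t):
    """Overlap: the run time of e shares at least one point with t."""
    return e.start <= t[1] and t[0] <= e.end


def included(e, t):
    """Inclusion: the run time of e lies inside t (completedness)."""
    return t[0] <= e.start and e.end <= t[1]


def ipf(P, t):
    """IPF: some P-event overlaps the reference time t."""
    return any(overlaps(e, t) for e in P)


def pf(P, t):
    """PF: some P-event lies inside t and no distinct P-event exists."""
    return any(included(e, t) and all(other == e for other in P) for e in P)


falls = [Event("fall-1", 3, 3)]                                   # a single event
dinners = [Event("dinner-1", 1, 2), Event("dinner-2", 5, 6)]      # a repeated event type
past = (0, 10)                                                    # reference time

print(ipf(falls, past), pf(falls, past))      # True True
print(ipf(dinners, past), pf(dinners, past))  # True False
```

In this toy setting the perfective fails for the repeated event type only because of the uniqueness conjunct; the completedness conjunct is satisfied in both cases.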

Given these assumptions, why do unique events call for perfectivity? Note that the two operators in (21) are of the same semantic type, differing only in specificity of content (every event that fulfills *e* ⊆ *t* is an event that fulfills *e* ○ *t*). Therefore, the two aspectual operators may legitimately be analyzed as forming a Horn-scale (Sonnenhauser 2006; 2007). As they are located on a Horn-scale, the use of the less specific imperfective marker will trigger the conversational implicature that the speaker lacks evidence for using the more specific perfective marker. If the speaker wanted to avoid inviting this inference, because she does have sufficient evidence for categorizing the event as completed and unique, she would have to use the perfective. The use of the imperfective would otherwise misinform the hearer by suggesting that the event is either non-unique or non-completed. Taking into account that the latter option is out in general-factual contexts (as general-factuals always report on completed events, see above), we may rewrite H3 as H3':
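
The asymmetry behind the scale can be spelled out in one step. The following worked inference is only a sketch: it assumes that run times are non-empty intervals and that the overlap relation amounts to sharing at least one point with the reference time. Under these assumptions inclusion entails overlap, so the perfective is the stronger member of the scale.

$$\begin{aligned} & e \subseteq t \;\Rightarrow\; e \bigcirc t \\ & \exists e [P(e) \land e \subseteq t \land \neg \exists e' [P(e') \land e' \neq e]] \;\Rightarrow\; \exists e [P(e) \land e \bigcirc t] \\ & \text{hence } [\![\text{PF}]\!](P)(t) \Rightarrow [\![\text{IPF}]\!](P)(t), \text{ but not conversely.} \end{aligned}$$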

<sup>9</sup> For ease of readability, I will not indicate Krifka's (1998) temporal trace function *τ*(*e*), which maps events onto their run time. Thus, wherever *e* is related to *t* in the semantic representations to follow, this is meant to express that *τ*(*e*) is related to *t*.


(22) **Hypothesis H3':** Perfective aspect is used in general-factuals whenever the speaker wants to refer to an event which is unique in the relevant context because imperfectivity in Czech and Polish general-factuals implies reference to non-unique events.

Note, by the way, that if accidental events imply uniqueness (and I shall argue that they do), Cummins' law ("Czech absolutely prohibits the general-factual imperfective in all low-volitional predicates") may be viewed as a special case: If the expression of a unique, completed event attracts perfective aspect, and if accidents represent a special kind of unique events, then the expression of an accident should likewise attract perfective aspect.

Hypothesis H3 gains further plausibility in view of the fact that necessarily unique events (i.e. cases where world knowledge makes event repetition unlikely) require perfective aspect. Note that these sentences do not represent cases of general-factuals, as general-factuals require the event property to be in principle replicable (e.g. Padučeva 1996: 58).


And yet H3 and H3' are, like the previous hypotheses, confronted with counterevidence. Consider the following Czech dialogue.

	- B: Ano, už mu odstraňovali slepé střevo.
	  yes already him took.out.ipf appendix
	  'Yes, his appendix has been removed.'

What A and B are talking about here is a non-repeatable, i.e. unique event (anything else would enforce the conceptualization of an absurd scenario where a formerly removed appendix is re-implanted). According to hypothesis H3', this should rule out imperfective aspect in favour of the perfective. Contrary to that prediction, however, the imperfective appears to be well suited to figure in the Czech example (25).

According to a comment by an anonymous reviewer, the situation in Polish seems to be the same as in Czech:

(26) Czy mu {wycinali / wycięli} ślepą kiszkę?
     q him took.out.ipf / took.out.pf appendix
     'Has his appendix been removed?'

Here, too, it is possible to use an imperfective verb form under reference to a completed event, which is in conflict with H3/H3'.<sup>10</sup>

We have to conclude that, as it stands, the uniqueness hypothesis seems to be falsified.

### **6 Uniqueness!**

In this section, I elaborate on hypothesis H3. The idea is to take the syntactic structure of the sentence into account and relativize the *semantic* uniqueness condition to the domain of the AspP. The new hypothesis (which is actually not "new" but merely more precise) will then be (27).

(27) **Hypothesis H4:** Perfective aspect is used in general-factuals whenever the speaker wants to refer to an event which is unique in the relevant context because perfectivity semantically expresses AspP-uniqueness in Czech and Polish.

Let me explain. Above I proposed the denotations stated in (28).

$$\begin{aligned} \text{(28)} \quad & [\![\text{IPF}]\!] = \lambda P \lambda t \exists e [P(e) \land e \bigcirc t]\\ & [\![\text{PF}]\!] = \lambda P \lambda t \exists e [P(e) \land e \subseteq t \land \neg \exists e' [P(e') \land e' \neq e]] \end{aligned}$$

Recall that these semantic assumptions presuppose the syntactic assumptions stated in (29):

(29) [… [AspP {PF/IPF} [VP … V … ]]]

<sup>10</sup>According to the reviewer, the use of the perfective form leads to an interpretation involving target state relevance (see §7). It needs to be checked whether target state relevance is indeed obligatory when the perfective is used in (26). If yes: Does it follow from the semantics of the perfective? Then Polish would approximate the Russian pattern. Or does it rather follow from pragmatic inferences, presumably in competition with the imperfective? I must leave this issue open.


What (28) basically says is that the use of a perfective form will always impose on interpretation the conditions of completedness and uniqueness. What (29) adds to that is that these interpretive conditions enter above the syntactic level of VP (see Tatevosov 2011; 2015 for a defense). It is thus the semantics of the VP that the functions PF and IPF operate on. Several consequences follow from this kind of grammatical architecture.

The first consequence to be noted here is that if the VP-property entails event uniqueness, perfective aspect will have to be used. This prediction seems to be borne out (for the sake of space I will only use Czech examples):


In (30), the VP-property is one that can be realized only once in a given world. The VP thus narrows down the denotation set to unique events. According to (28) and (29), this strictly calls for the perfective (when presupposing completedness) because the speaker cannot but refer to a unique event. This prediction is in line with the use of perfective aspect observed in (30).

Let me now turn to the second consequence that follows from the assumptions made above, specifically concerning general-factuals. If the VP does not restrict denotation to unique events, then on semantic grounds alone the perfective is neither required nor excluded. Perfective aspect may be used, but *if* it is used, the expression of event uniqueness introduced by it should be pragmatically motivated. Below I present three contexts in which the use of the perfective is pragmatically felicitous because expressing uniqueness is what the speaker wants to convey (the list is not meant to be exhaustive).

Context 1: The choice of the perfective expressing uniqueness is felicitous because the speaker wants to refer to an accident. This is the case in (3), repeated here for convenience.

(31) Jako malá jsem [VP spadla ze stromu].
     as small aux fell.pf from tree
     'As a child I (once) fell from a tree.'

In (31), the speaker reports on an accidental event. It lies in the very concept of an accident that it is unexpected. If, unexpectedly, an accident happens to occur once (twice…), it will not be expected to occur a second time (third time…). Given this, communicating the existence of an accident, as in (31), or asking about the existence of an accidental event, as in (5) from above, will always invite an inference of uniqueness. It follows from hypothesis H4 that the perfective is to be used because otherwise the event would be understood as non-unique and, hence, as non-accidental.

Context 2: The choice of the perfective is felicitous because the speaker refers to an action that requires unusual skills. (32), for instance, refers to a dare. It should be read here as an answer to (18):

(32) Ano, jako malá jsem z toho prkna [VP skočila].
     yes as small aux from that diving.board jumped.pf
     'As a child I (once) jumped from that diving board.'

Here, arguably, the speaker answers the question of whether she has performed an action that (from the point of view of the questioner at least) requires extraordinary courage of those who perform it. Given this, the speaker may assume that the addressee (= questioner) takes the occurrence of such an action as unlikely. Similar to the case of accidents, it then follows that if the speaker states that she has performed the action once, she may be sure not to be expected to have performed it a second time. Thus, the expression of uniqueness, which H4 attributes to the use of perfective aspect, is well grounded in the context of (32).

(33) shows a similar example. Again, the kind of event is such that already one event realization will count as something special (from Fortuin & Kamphuis 2018: 115):

(33) Už jste někdy {dal / \*dával} gól?
     already aux ever gave.pf / gave.ipf goal
     'Have you ever scored a goal?'

Context 3: The choice of perfective aspect is felicitous because the speaker refers to an extraordinary event, see (34).

(34) V minulém století Niagarský vodopád [VP zamrzl].
     in last century N. waterfalls froze.over.pf
     'In the last century Niagara Falls froze over.'

It is very difficult to imagine that the Niagara Falls freezes over completely. Thus, already one such event is unexpected. If it turns out to have taken place, we will not expect it to take place a second (let alone third, fourth, …) time. Being about an unlikely event, (34) conveys uniqueness, and the attested choice of perfective aspect is correctly predicted by H4.

Finally, I turn to the case that rendered the hypothesis H3 wrong.


(35) Ano, už mu [VP odstraňovali slepé střevo]. = (25)
     yes already him took.out.ipf appendix
     'Yes, his appendix has been removed.'

As can be seen, I have identified the element *mu* 'him.dat' as being located outside of the VP. This might seem debatable, but see Dvořák (2010) for independent evidence in support of the assumption that benefactive *mu* is base-generated above VP. The point is that, if this syntactic decision can be maintained, the VP of (35) will turn out to supply a property describing a repeatable event. Countless appendectomies are being carried out at the moment in the hospitals of the world. This does not deny the uniqueness intuition that we feel in view of (35). The intuition is real, but it arguably comes in by semantic composition taking place above VP. Since as a matter of fact every person has at most one appendix, the meaning of *mu* serves as a referential anchor for the otherwise non-specific meaning of *slepé střevo* 'appendix'. As a consequence, once the semantic contribution of *mu* is taken into account, the appendix will be understood to be a specific one. This, in turn, referentially anchors the whole event. What is described now is no longer a repeatable event, but a unique one.

Crucially, our new hypothesis H4 does *not* dictate perfectivity for (35). Since H4 incorporates the assumption that the aspectual operators PF and IPF take VP-meanings as input, and since the VP of (35) does not involve uniqueness, the use of imperfective aspect is not ruled out on semantic grounds. H4 predicts that the imperfective can be used in contexts where the uniqueness of the event is pragmatically irrelevant to what the speaker wants to convey.
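
The difference between uniqueness at the level of the VP and uniqueness arising above it can be illustrated with a small sketch. It is a toy model under loudly stated assumptions: the event list, the patient names and the helper functions `is_unique` and `anchor_to` are all invented for illustration, and the contribution of *mu* is simplified to a restriction of the event set to one affected person.

```python
# Toy event records: (event id, patient). All of them are appendix removals,
# i.e. all satisfy the VP property 'remove an appendix'. The data are invented.
vp_events = [("e1", "Petr"), ("e2", "Jana"), ("e3", "Karel")]


def is_unique(events):
    """True if the property holds of exactly one event."""
    return len(events) == 1


def anchor_to(events, person):
    """Restrict the event set to one affected person (simplified role of 'mu')."""
    return [ev for ev in events if ev[1] == person]


print(is_unique(vp_events))                     # False
print(is_unique(anchor_to(vp_events, "Petr")))  # True
```

On this picture the aspectual operator only sees the first, non-unique set, which is why the imperfective is not excluded on semantic grounds.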

Example (36) shows a similar case (adopted from Cummins 1987):

(36) Už jsem večeřel.
     already aux had.supper.ipf
     'I've already had supper.'

Speakers of Czech may refuse an invitation to supper by uttering (36). The utterance will be felt to address a unique event, and the use of imperfective morphology runs counter to the predictions of H3. Hypothesis H4, by contrast, may account for why the imperfective is allowed in that case. Again, we have to pay attention to the VP:

(37) Už jsem [VP večeřel]. = (36)
     already aux had.supper.ipf
     'I've already had supper.'


In (37), as in (35), the VP does not describe a unique event. The uniqueness expectation associated with the sentence likewise enters above VP, i.e. on account of further information provided by the linguistic and non-linguistic context within which the VP appears. The relevant pieces of information stem from three sources: first, the (dropped) subject, which refers to a specific person as the agent of the event (the speaker); second, the topic time, which is a specific day (today); third, script knowledge, which says that supper is normally taken once per day. In sum, the VP does not determine the uniqueness of the event in (36), the use of the perfective is therefore not mandatory, and imperfective aspect remains an option according to hypothesis H4.

To sum up, the observations made above amount to the following picture for Czech and Polish (which is valid not only for general-factual contexts):<sup>11</sup>


In the latter case it is upon the speaker to decide on pragmatic grounds whether the denoted event should be understood as unique. If signaling uniqueness was intended by the speaker (because she perhaps wanted to refer to an accident, to a dare or to a sensational news event), she would have to use a perfective verb form. If, on the other hand, uniqueness is not what the speaker wants to signal, she should use an imperfective verb.
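
The resulting picture for Czech and Polish can be summarized as a small decision procedure. This is a hedged sketch only: the function name `aspect_choice` and its two boolean parameters are hypothetical simplifications, and the procedure presupposes throughout that reference to a completed event is intended.

```python
def aspect_choice(vp_entails_uniqueness, speaker_signals_uniqueness):
    """Return 'pf' or 'ipf' for a completed past event (Czech/Polish sketch)."""
    if vp_entails_uniqueness:
        # The VP property can only be realized once: the perfective is required.
        return "pf"
    # Otherwise the choice is made on pragmatic grounds by the speaker.
    return "pf" if speaker_signals_uniqueness else "ipf"


# Accident, dare or sensational event, as in (31), (32), (34)
print(aspect_choice(False, True))   # pf
# Uniqueness arises only above VP and is not at issue, as in (35), (36)
print(aspect_choice(False, False))  # ipf
# VP property realizable only once, as in (30)
print(aspect_choice(True, False))   # pf
```

The second parameter is irrelevant whenever the first is true, which mirrors the claim that a uniqueness-entailing VP leaves the speaker no choice.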

### **7 Taking Russian into account**

As we saw above, Czech/Polish and Russian general-factuals do not pattern alike. The story told above takes care of the former languages. I have proposed denotations for PF and IPF in Czech and Polish that predict the aspectual choices made by the speakers of these languages. The open question is: why does Russian deviate from Czech/Polish in general-factual contexts?

My answer to that question follows Stunová (1991), who traces the differing distributions of the aspectual markers in Czech and Russian back to a difference in semantic content of the respective perfective category, the imperfective category being treated in Czech as well as in Russian as "an unmarked member of the aspectual opposition" (Stunová 1991: 297). Stunová's (1991) results are summarized in (38):<sup>12</sup>

<sup>11</sup>The reader should bear in mind that the exposition presupposes that reference to completed events is intended.

(38) PF<sub>Czech</sub> ↝ totality
     PF<sub>Russian</sub> ↝ totality + connectedness

I propose to reinterpret the feature of 'totality' as comprising the features (conditions) 'completedness' and 'uniqueness'. Given that move, Stunová's semantics for the Czech perfective will be in perfect harmony with the conclusions that I have arrived at. The remarkable thing is that Stunová's result is derived from empirical observations based on entirely different linguistic "parameters" (in the sense of Dickey 2000) than mine. While I am concerned here with the choice of perfective or imperfective aspect in general-factual contexts, Stunová (1991) discusses aspect choice in sequences of events, in the historical present, in generics and in pluractionals.

Stunová's feature 'connectedness' is adopted from Barentsen (outlined in Barentsen 1995; 1998). According to Barentsen's (1998: 45) informal description, an event is "connected" if it is viewed from the perspective of the changes that it is imposing on its environment. Given this, Barentsen's notion is virtually identical (or at least very similar) to Grønn's (2004) pragmatic notion of target state relevance, which he derives from the semantic condition of target state validity.<sup>13</sup> The notion of target state validity is formally defined by means of the condition *f*<sub>END</sub>(*t*) ⊆ *f*<sub>TARGET</sub>(*e*).<sup>14</sup>

Given all this, we may rewrite (38) as (39):

(39) PF<sub>Czech</sub> ↝ completedness + uniqueness
     PF<sub>Russian</sub> ↝ completedness + uniqueness + target state validity

Now, formally, (39) may be stated as (40):

$$\begin{aligned} \text{(40)} \quad & [\![\text{PF}_{\text{Czech}}]\!] = \lambda P \lambda t \exists e [P(e) \land e \subseteq t \land \neg \exists e' [P(e') \land e' \neq e]]\\ & [\![\text{PF}_{\text{Russian}}]\!] = \lambda P \lambda t \exists e [P(e) \land e \subseteq t \land \neg \exists e' [P(e') \land e' \neq e] \land f_{\text{END}}(t) \subseteq f_{\text{TARGET}}(e)] \end{aligned}$$

<sup>12</sup>It should be noted that the conclusions in Stunová (1993) differ from those in Stunová (1991).

<sup>13</sup>Here is where the difference between the two notions lies: while target state relevance determines that the event produces an occasion for subsequent events, connectedness is more broadly construed, alternatively allowing the event to start from the final state created by a preceding event; see Dickey (2018: 81ff.) for discussion of that point.

<sup>14</sup>The condition *f*<sub>END</sub>(*t*) ⊆ *f*<sub>TARGET</sub>(*e*) requires the reference time to end when the target state is in force.


The "semantically unmarked" imperfective will be the same in all of the discussed languages:<sup>15</sup>

$$\text{(41)} \quad [\![\text{IPF}]\!] = \lambda P \lambda t \exists e [P(e) \land e \bigcirc t]$$

In (40), target state validity is implemented in the Russian perfective operator as an additional condition besides completedness and uniqueness. It should be noted, however, that the semantic content of target state validity by itself implies the conditions of uniqueness and completedness (Mittwoch 2008: 342–344). Accordingly, (40) may be reduced to (42):

$$\begin{aligned} \text{(42)} \quad & [\![\text{PF}_{\text{Czech}}]\!] = \lambda P \lambda t \exists e [P(e) \land e \subseteq t \land \neg \exists e' [P(e') \land e' \neq e]]\\ & [\![\text{PF}_{\text{Russian}}]\!] = \lambda P \lambda t \exists e [P(e) \land f_{\text{END}}(t) \subseteq f_{\text{TARGET}}(e)] \end{aligned}$$

Now back to the initial question of why Russian deviates from the Czech and Polish pattern in the way it does. The answer is that, given the Russian perfective operator as stated in (42), it will be ruled out for semantic reasons in any general-factual context. The condition of target state validity, and thus the perfective operator, is per se incompatible with general-factuals. To meet the condition of target state validity, the event has to have a specific reference time. General-factuals, by contrast, require the event to be located in a reference time which is "big and floating" (Grønn 2004: 273; see Mueller-Reichau 2016 for an explanation as to why this is so).
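
The condition of target state validity itself can be illustrated with a toy check. This is a minimal sketch under simplifying assumptions: times are integers, the reference time and the target state are closed intervals given as pairs, and the names `end_point` and `target_state_valid` as well as the window example are invented for illustration. The sketch only implements the condition as paraphrased in footnote 14 (the reference time ends while the target state is in force); it does not model what makes a reference time "big and floating".

```python
def end_point(ref_time):
    """f_END(t): the final moment of the reference time t = (start, end)."""
    return ref_time[1]


def target_state_valid(ref_time, target_state):
    """True if the reference time ends while the target state (start, end) holds."""
    return target_state[0] <= end_point(ref_time) <= target_state[1]


# Invented example: an opening event at time 5 whose target state
# (the window being open) lasts until time 8.
window_open = (5, 8)

print(target_state_valid((4, 7), window_open))   # True
print(target_state_valid((4, 12), window_open))  # False
```

The check depends on exactly where the reference time ends, which is the sense in which the condition needs a specific reference time.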

The incompatibility of general-factual interpretations with target state validity, as associated with perfective aspect, is, crucially, independent of whether or not the denoted event is unique. This is a non-trivial result, as it runs counter to Dickey's (2018) claim that "[t]he only way to establish that an event […] is unique in time is to specify the temporal (and causal) context of the event in question. And this can only be done by providing information about prior and subsequent situations".

An event that has a specific reference time is necessarily unique, but a unique event does not have to have a specific reference time. This is what sets Russian apart from Czech and Polish, i.e. why Russian excludes general-factual perfectives, whereas Czech and Polish allow for them under the described circumstances.

<sup>15</sup>I wish to point out that under the proposed analysis (which closely follows Grønn 2004) the imperfective is, in fact, not unmarked/unspecified, but rather radically *under*specified in comparison to the perfective. Thus, the approach is *not* Jakobsonian. The meanings in (40) and (41) all represent Hauptbedeutungen in the sense of Kuryłowicz (1960: 178).

### 13 General-factual perfectives

### **8 Conclusions**

In this paper, I have addressed the variation in aspect choice in general-factual contexts between Czech, Polish and Russian. I have argued that the asymmetry between Czech and Polish on the one hand, and Russian on the other hand, should be related to a difference in the semantics of the respective perfective operators. While perfectivity in the former languages introduces the condition that the denoted event is completed and unique, perfectivity in Russian more strongly requires that the reference time ends when the target state is in force. The imperfective operator is in each of these languages semantically vague in that it requires no more than that the reference time overlaps the event time.

I have shown that my conclusions are in line with much of the existing descriptive and theoretical literature on Slavic aspect. Specifically, I have made a case for the following claims:


Still, many questions remain open. How do the generalizations that I derived from general-factual contexts agree with the patterns of aspectual variation observed in other contexts ("parameters")? The closeness to Stunová's results gives rise to optimism, but these things have to be checked.

I wish to conclude with a further argument that one might bring forward in support of the story told in this paper. Dickey (2000: 112) reports that the Polish perfective in (43) is possible given the following scenario: The speaker, who had instructed the hearer to air the room beforehand, has entered the room, the hearer is around, and the (only) window is closed at the moment. This possibility of perfective aspect is in sharp contrast to the case of Russian, where the use of a perfective verb would strictly require the window to be open.

(43) Czy otworzyłeś już okno?
     q opened.pf already window
     'Did you already open the window?'

Drawing on a suggestion made by Dickey (2018: 84), I speculate that the absence of target state validity in the Polish perfective operator provides the reason why the perfective is usable here despite result annulment, and that the significance of uniqueness (that there is the expectancy of a single event realization) explains why the perfective is indeed used in the particular context at hand. This points to but one out of many intriguing issues that await investigation in the field of inner-Slavic aspectual variation.
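The contrast between the operators can be made concrete with a small sketch. It assumes a toy model in which times are closed intervals over integer points, each event carries a run time and a target-state interval, and Polish is treated like Czech (completedness plus uniqueness). All class and function names are hypothetical and serve illustration only; this is not an implementation proposed in the literature cited above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval over integer time points (a simplifying assumption)."""
    start: int
    end: int

    def contains(self, other: "Interval") -> bool:
        return self.start <= other.start and other.end <= self.end

    def overlaps(self, other: "Interval") -> bool:
        return self.start <= other.end and other.start <= self.end


@dataclass(frozen=True)
class Event:
    run: Interval     # run time of the event
    target: Interval  # period during which the target state is in force


def ipf(events, t):
    """Imperfective: some P-event overlaps the reference time t."""
    return any(e.run.overlaps(t) for e in events)


def pf_completed_unique(events, t):
    """Czech/Polish-style perfective: a P-event lies within t and is the only P-event."""
    return any(
        t.contains(e.run) and all(other == e for other in events)
        for e in events
    )


def pf_target_state(events, t):
    """Russian-style perfective: t ends while the target state of a P-event holds."""
    return any(e.target.start <= t.end <= e.target.end for e in events)


# Scenario modelled on (43): the window was opened (times 1-2), stayed open
# until time 5, and is closed again at the moment of speech (time 10).
reference_time = Interval(0, 10)
opening_annulled = [Event(run=Interval(1, 2), target=Interval(2, 5))]
opening_intact = [Event(run=Interval(1, 2), target=Interval(2, 20))]

print(pf_completed_unique(opening_annulled, reference_time))  # True
print(pf_target_state(opening_annulled, reference_time))      # False
print(pf_target_state(opening_intact, reference_time))        # True
```

On this toy model, the completed-and-unique condition is met even though the result has been annulled, whereas the target-state condition is met only if the window is still open when the reference time ends.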

### **Abbreviations**


### **Acknowledgements**

I would like to thank the audience of FDSL 12 and two anonymous reviewers for valuable feedback. My gratitude also goes to Anna Artwińska, Petr Biskup, Mojmír Dočekal, Małgorzata Gałaś-Prokopf, Wojciech Roskiewicz, Danuta Rytel-Schwarz, Yulia Sorokina, Lenka Vávrová, Marcin Wągiel, Maria Yastrebova and Dagmar Žídková-Gunter. Finally I wish to thank the editors of this volume and Radek Šimík in particular for thoughtful and instructive comments. All remaining shortcomings are, of course, my own.

### **References**




### **Chapter 14**

## **Gender encoding on hybrid nouns in Bosnian/Croatian/Serbian: Experimental evidence from ellipsis**

Andrew Murphy University of Leipzig

Zorica Puškar Leibniz-Zentrum Allgemeine Sprachwissenschaft, Berlin

Matías Guzmán Naranjo

Heinrich Heine University Düsseldorf

In this paper, we report the results of an experimental study on the possibility of gender mismatches with ellipsis of a particular type of hybrid nouns in Bosnian / Croatian / Serbian (henceforth: BCS). Using agreement mismatches under NP ellipsis as a diagnostic for gender feature specification of hybrid nouns, we show that these nouns disallow agreement mismatches under NP ellipsis if natural gender (and a presupposition that it introduces) is present either in the antecedent or the ellipsis site. We argue that natural gender feature (masculine in our study) can optionally be present on the hybrid noun, and that its inclusion in ellipsis contexts leads to a violation of the standard identity requirement between the antecedent and the ellipsis site, namely Merchant's (2001) e-givenness.

**Keywords:** gender features, ellipsis, agreement, mismatches

### **1 Introduction: Hybrid nouns in BCS**

Hybrid nouns have been a challenge for theories of agreement and NP structure (see Corbett 1991; Wechsler & Zlatić 2003; Alsina & Arsenijević 2012b,a; Pesetsky 2013; Kramer 2015; Landau 2016; Smith 2015; 2016; Arsenijević & Gračanin-Yuksek 2015; Despić 2017), as it seems that they can simultaneously bear two types of gender specification: (i) natural gender (reflecting the gender of the referent) and (ii) grammatical gender (assigned arbitrarily). In this paper, we focus only on the Class II hybrid nouns in BCS (ending in *-a*) which have grammatical feminine gender, but variable natural gender, which depends on the discourse referent.

Andrew Murphy, Zorica Puškar & Matías Guzmán Naranjo. 2018. Gender encoding on hybrid nouns in Bosnian/Croatian/Serbian: Experimental evidence from ellipsis. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 313–336. Berlin: Language Science Press. DOI:10.5281/zenodo.2545533

Table 1: Some hybrid nouns in BCS


For instance, with a masculine referent, the noun *mušterija* ('customer') can trigger either grammatical feminine (1a)/(1b) or natural masculine agreement (1c)/(1d) (subject to speaker variability).

	- b. Nov-a mušterija je kupila jaknu.
	  new-f customer is bought.f jacket
	  'A new (male or female) customer bought a jacket.'
	- c. % Milan nam je nov-i mušterija.
	  Milan us is new-m customer
	  'Milan is our new customer.'
	- d. % Nov-i mušterija je kupio jaknu.
	  new-m customer is bought.m jacket
	  'A new (male) customer bought a jacket.'

With female referents, however, such nouns can only trigger feminine agreement:

(2) Marija nam je nov-a / \*nov-i mušterija.
    Marija us is new-f / new-m customer
    'Marija is our new customer.'


One could treat these as so-called "epicene" nouns of the type found e.g. in Brazilian Portuguese and Greek, which can be used with both masculine and feminine referents without change in form (Bobaljik & Zocca 2011; Merchant 2014; Kramer 2015). It has been proposed that such nouns are simply listed in the lexicon twice, each with a different gender feature (e.g. Merchant 2014: 19). However, such an approach to BCS hybrid nouns seems to be problematic, as there is evidence that these nouns can *simultaneously* bear natural and grammatical gender. For example, in (3) the adjective and determiner target different gender values of the noun, while (4) shows the same for the nominal modifier and the participle.<sup>1</sup>


(4) Osm-a budala je bio-∅ mnogo kul tip ali su ga se drugi malo plašili.<sup>2</sup>
    eighth-f.sg fool.sg is been-m.sg very cool guy but are him refl others little feared
    'The eighth fool was a very cool guy, but others were a bit afraid of him.'

These examples raise the question of the structural representation of such nouns: whether they do contain both types of gender feature simultaneously, and how exactly they should be represented. In what follows, we will answer these questions by using NP ellipsis as a diagnostic for the gender feature specification of BCS hybrid nouns.

<sup>1</sup>Relative pronouns can also show different gender values than the attributive modifiers of a hybrid noun, as shown in (i), obtained at http://vukajlija.com/seoski-fudbalski-tim-iz-betonlige <accessed 26.11.2016>.

(i) Lokaln-a pijanica, koj-i je završio sa igranjem fudbala…
    local-f.sg drunkard who-m.sg is finished.m.sg with playing football
    'A local drunkard, who's finished playing football…'

However, as pointed out by a reviewer, whether the relative pronoun agrees directly with the hybrid noun, or whether agreement is more indirect (since we are dealing with a nonrestrictive relative clause in which the relative pronoun is more akin to regular pronouns, cf. de Vries 2006), is an issue that requires further investigation. See Arsenijević & Gračanin-Yuksek (2015) for further discussion of hybrid agreement with relative pronouns.

<sup>2</sup>http://magdajanjic.tumblr.com/post/85348961537/budala <accessed 26.11.2016>.


### **2 Gender mismatches and NP ellipsis**

There is a growing body of literature on the permissibility of gender mismatches under NP ellipsis (e.g. Nunes & Zocca 2010; Bobaljik & Zocca 2011; Merchant 2014; Sudo & Spathas 2016; Barrie 2016). Based on whether gender mismatches under NP ellipsis are allowed or not, previous literature has identified three classes of nouns. The first type are the two-way alternating nouns (henceforth: the *doctor*-class), where a masculine antecedent can license deletion of a noun with a feminine referent, and *vice versa*:

	- a. O Petros ine kalos jatros, ala i Maria ine mia kakia ⟨jatros⟩.
	  the Petros is good.m doctor but the Maria is a.f bad.f doctor
	  'Petros is a good doctor, but Maria is a bad one.'
	- b. I Maria ine kali jatros, ala o Petros ine enas kakos ⟨jatros⟩.
	  the Maria is good.f doctor but the Petros is a.m bad.m doctor
	  'Maria is a good doctor, but Petros is a bad one.'

(Greek; Merchant 2014: 15)

The second type are the non-alternating nouns (henceforth: *brother*-class), which do not allow mismatches in either direction:

### (6) **Non-alternating nouns** (*brother*-class)

a. \* O Petros ine kalos adherfos, ala i Maria ine mia kakia ⟨adherfi⟩.
   the Petros is good.m brother but the Maria is a.f bad.f sister
   Intended: 'Petros is a good brother, but Maria is a bad one (sister).'

b. \* I Maria ine kali adherfi, ala o Petros ine enas kakos ⟨adherfos⟩.
   the Maria is good.f sister but the Petros is a.m bad.m brother
   Intended: 'Maria is a good sister, but Petros is a bad one (brother).'
   (Greek; Merchant 2014: 12)

Finally, one-way alternating nouns (or the *actor*-class) allow a masculine noun to antecede an elided feminine noun (7a), but a mismatch in the opposite direction is ungrammatical (7b).


### (7) **One-way alternating nouns** (*actor*-class)


With regard to their tolerance for gender mismatches, these three classes can be summarized as in Table 2:

Table 2: Classes of predicative nouns under ellipsis (cf. Bobaljik & Zocca 2011: 162)


We would expect that hybrid nouns in BCS pattern with one of these types. However, the additional complication with hybrid nouns is that there is evidence for the simultaneous presence of two gender features. Thus, we will try to gain some insight into their structure by testing the acceptability of gender mismatches with mismatched referents in either direction (8b)/(8c).

	- b. ? Milan mu je star-**i** mušterija, a Marija mu je nov-**a** ⟨mušt.⟩.
	  Milan him is old-m customer and Marija him is new-f cust.
	  'Milan is his old customer and Marija a new one.'
	- c. ? Marija mu je star-**a** mušterija, a Milan mu je nov-**i** ⟨mušt.⟩.
	  Marija him is old-f customer and Milan him is new-m cust.
	  'Marija is his old customer and Milan a new one.'


### **3 Experiment**

The aim of the experiment was to discover whether an agreement mismatch was tolerated when masculine agreement was found in either the antecedent (8b) or the ellipsis site (8c), i.e. whether the hybrid nouns under study were two-way alternating, one-way alternating, or allowed no alternation at all when the noun had natural masculine gender (indicated by the agreement on the adjective). Considering previous theories (Nunes & Zocca 2010; Bobaljik & Zocca 2011; Merchant 2014; Sudo & Spathas 2016), the (im)possibility of a mismatch should be an indicator of (i) the difference in the quality of gender features (i.e. differences in their semantics or morphosyntactic representation) and (ii) the licensing conditions on ellipsis (i.e. identity requirements between the antecedent and the ellipsis site). Finally, sentences in which both adjectives show feminine agreement, regardless of the gender of the subject, were expected to be grammatical, as the hybrid nouns in question have grammatical gender as a formal feature.

In order to verify the grammaticality of gender mismatches under NP ellipsis, we ran an online acceptability judgement study.<sup>3</sup> The experimental design involved the factors in (9). The first factor involves the type of agreement the adjective has with a masculine subject and has two levels (grammatical agreement mf vs. natural agreement mm) (9a), which should serve as an indicator of the type of gender on the hybrid noun (grammatical vs. natural). The second factor pertains to agreement with feminine subjects and has only a single level (ff, both grammatical and natural agreement) (9b). The final factor regards the position of the masculine referent and has two levels: in the first clause, or in the second clause (the one with NP ellipsis) (9c), which should indicate whether natural gender can function as an antecedent for ellipsis and whether it can be found in the ellipsis site. The 2×2×1 combination of factors in (9) yields the four experimental conditions in Table 3.

	- a. agreement with masculine subject; two levels: grammatical (mf) and natural (mm)
	- b. agreement with feminine subject; one level: ff
	- c. clause; two levels: first and second
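How the three factors combine into four condition labels can be sketched as follows. The labelling convention assumed here is that each clause contributes a two-letter code (referent gender followed by agreement gender) and that the clause containing the masculine referent determines the order; the variable and function names are hypothetical.

```python
from itertools import product

masculine_agreement = ["mf", "mm"]      # factor (a): grammatical vs. natural agreement
feminine_agreement = ["ff"]             # factor (b): a single level
masculine_clause = ["first", "second"]  # factor (c): position of the masculine referent


def condition_label(masc: str, fem: str, clause: str) -> str:
    """Concatenate the two clause codes in the order in which the clauses occur."""
    return masc + fem if clause == "first" else fem + masc


conditions = sorted(
    condition_label(masc, fem, clause)
    for masc, fem, clause in product(
        masculine_agreement, feminine_agreement, masculine_clause
    )
)
print(conditions)  # ['ffmf', 'ffmm', 'mfff', 'mmff']
```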

Some example test sentences for each condition are given below. Each contains two clauses coordinated by the conjunction *a* 'and' and an elided noun in the second conjunct. Sentences (10) and (12) were expected to be perceived as grammatical, while (11) and (13) were the relevant mismatching cases.

<sup>3</sup> For a detailed description of the design and for all the materials see https://osf.io/r3npz.

Table 3: Experimental conditions


Since masculine agreement with hybrid nouns is not accepted by all speakers, a control sentence involved only masculine referents and agreement (i.e. the condition mmmm) (14).


Furthermore, we included the following additional controls: a grammatical baseline with feminine agreement and all feminine referents ffff (15), and an ungrammatical baseline fmfm, involving masculine agreement with feminine referents (16).



Figure 1: Example experimental item


A total of 18 controls were included, 6 per combination (all of them the same in every list, see below). There were 96 test sentences altogether, with 24 sentences per condition. All lexical items were balanced for proper names (24 male, 24 female), adjectives (48) and hybrid nouns (6). We used a Latin square design where the 96 test sentences were distributed in 4 lists, such that each list contained different items for every condition. Each participant thus saw only items from one list, i.e. 24 test items, 6 items per condition. The experiment was coded using LimeSurvey<sup>4</sup> and run online via the LimeService platform. Sentences were presented one-by-one in a random order. Each participant saw 62 sentences (24 test items + 18 controls + 20 fillers) and was asked to give a grammaticality judgement on a 7-point Likert scale (1 = completely bad, 7 = sounds excellent) by dragging a centered slider either to the left or right to indicate their response (see Figure 1).
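The distribution of test sentences over lists can be sketched as a standard Latin square rotation. The sketch assumes that the 96 test sentences derive from 24 item sets, each realized once in every condition, which is one way of reading the numbers above; the names `build_lists`, `CONDITIONS` and `N_ITEMS` are hypothetical.

```python
from collections import Counter

CONDITIONS = ["ffmf", "mfff", "ffmm", "mmff"]
N_ITEMS = 24  # assumed item sets: 24 items x 4 conditions = 96 test sentences


def build_lists(n_items=N_ITEMS, conditions=CONDITIONS):
    """Rotate conditions across items so that every list sees each item exactly once."""
    n_lists = len(conditions)
    return [
        [(item, conditions[(item + k) % n_lists]) for item in range(n_items)]
        for k in range(n_lists)
    ]


lists = build_lists()

for presentation_list in lists:
    per_condition = Counter(condition for _, condition in presentation_list)
    # 24 test items per list, 6 per condition
    print(len(presentation_list), dict(per_condition))
```

With this rotation, no participant sees the same item in two conditions, and across the four lists every item occurs in every condition exactly once.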

The experiment was performed by 164 volunteers, 131 female and 33 male, aged 16–66. Participants reportedly spoke different varieties of BCS: Bosnian (22 speakers), Croatian (5 speakers) and Serbian (136 speakers). None of the participants were paid or otherwise compensated for their participation.


Figure 2: All responses by all participants.

Figure 3: All responses by all participants according to whether the speaker liked or disliked the sentences in the mmmm condition.


### **4 Results**

Figure 2 shows the distribution of responses under each condition from all participants for the baselines (first column) and our four experimental conditions (second and third column). The baselines show strong grammaticality effects: feminine agreement with feminine subjects was rated as grammatical, while masculine agreement was unacceptable. A u-shaped type of distribution for mmmm (masculine agreement with masculine subjects) suggests that although the majority of speakers dispreferred it, some speakers clearly did find it grammatical. With mismatching subjects, feminine agreement (ffmf and mfff) was rated as grammatical, as shown in the second column. More gradient (un)acceptability was found for ffmm and mmff (third column).

We compared the responses for all conditions based on whether speakers liked the mmmm combination (median rating ≥ 4) or disliked it (Figure 3). In total, 51 speakers found the mmmm combination grammatical. The overall picture shows that the distributions concerning the grammatical patterns are fundamentally the same, regardless of whether the speaker liked mmmm or not (the first two rows). However, there was a difference for the conditions with low acceptability scores (ffmm and mmff), as can be seen in the final row of Figure 3. Speakers who liked mmmm showed no clear preference or dispreference for either of the mismatching combinations – they were perceived as equally bad.
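The grouping criterion can be sketched in a few lines. The ratings below are invented for three hypothetical participants and are not taken from the study; only the threshold (median rating ≥ 4 on the mmmm controls) comes from the text.

```python
from statistics import median


def likes_mmmm(ratings, threshold=4):
    """A participant counts as liking mmmm if their median rating is at least the threshold."""
    return median(ratings) >= threshold


def split_participants(mmmm_ratings):
    """Return the sets of participants who liked and who disliked the mmmm controls."""
    liked = {p for p, ratings in mmmm_ratings.items() if likes_mmmm(ratings)}
    disliked = set(mmmm_ratings) - liked
    return liked, disliked


# Invented ratings (six mmmm control sentences per participant, 7-point scale).
example_ratings = {
    "p1": [6, 7, 5, 6, 7, 6],
    "p2": [1, 2, 1, 1, 3, 2],
    "p3": [4, 4, 3, 5, 4, 4],
}

liked, disliked = split_participants(example_ratings)
print(sorted(liked), sorted(disliked))  # ['p1', 'p3'] ['p2']
```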

Figure 4 shows the four crucial conditions just for those speakers who liked the mmmm baseline (median rating ≥ 4). We see that the patterns with feminine agreement throughout were overwhelmingly acceptable and grammatical to these speakers, while the conditions with mismatches show more variation and received comparably lower scores.

To clarify whether the differences in these responses were statistically significant, we fitted an ordinal regression model with the rating as the dependent variable, condition as the only predictor, and participant and hybrid\_noun as random effects (see the Appendix for further details about the model).<sup>5</sup>
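The general shape of such a model can be sketched as a generic cumulative-link (probit) regression. The notation below is introduced here purely for illustration and is not taken from the original analysis; the exact parameterization used by the fitting software may differ.

$$P(y_{ij} \le k) = \Phi(\tau_k - \eta_{ij}), \qquad \eta_{ij} = \beta_{\text{condition}(i,j)} + u_{i} + v_{\text{noun}(i,j)}, \qquad k = 1, \dots, 6$$

Here the response is the rating given by participant *i* to sentence *j* on the 7-point scale, the thresholds τ are ordered cut points, β is the effect of the condition relative to the reference level, and *u* and *v* are random intercepts for participant and hybrid noun, respectively.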

In the plot in Figure 5, conditions with overlapping confidence intervals are not statistically different from each other. We see that ffmf and mfff slightly overlap with 0, which means that they are not statistically different from the intercept (the grammatical baseline ffff). On the other hand, ffmm and mmff are statistically worse than the intercept, but not different from each other or the ungrammatical baseline fmfm.

<sup>4</sup>https://limesurvey.org

<sup>5</sup>Ordinal regression assumes an ordered discrete response variable. This is exactly the kind of data one obtains from grammaticality judgement tasks. In the models, gender and region did not play a role.

The conclusion we can draw from this is that gender mismatches are possible in either direction as long as the adjectival agreement is feminine. This is shown in (17) and (18).

### (17) **Two-way mismatches possible with feminine agreement**


### (18) **No mismatch possible with masculine agreement**


Figure 4: Responses by participants who liked mmmm (median rating ≥ 4)

Figure 5: Posterior means and 95% confidence intervals for the Bayesian regression model.

Thus, with regard to the noun classes in Table 2, our hybrid nouns behave like nouns of the two-way alternating *doctor*-class when there is agreement in grammatical gender (feminine), but they behave like non-alternating nouns of the *brother*-class if at least one of them bears natural gender (masculine).

### **5 Analysis**

First, let us assume that the identity requirement for elided material involves Merchant's (2001) e-givenness (i.e. mutual entailment), as defined in (19), where existential (∃)-type shifting is "a type-shifting operation that raises the expressions of type *t* and existentially binds unfilled arguments" while F-closure of *α* "is the result of replacing F-marked parts of *α* with ∃-bound variables of the appropriate type (modulo ∃-type shifting)" (Merchant 2001: 14):

	- (i) A entails F-clo(E), and
	- (ii) E entails F-clo(A).


This condition requires that mutual entailment hold between the antecedent and the ellipsis site. This rules out semantically equivalent but non-matching ellipsis sites such as (20b) (Merchant 2001: 27).

	- b. # Abby *called Ben an idiot* after Mary did ⟨insult Ben⟩. (∃*x* . *x* called Ben an idiot ↮ ∃*x* . *x* insulted Ben)

Following Cooper (1983), it is often assumed that (natural) gender features can introduce presuppositions (see also Sauerland 2003; 2008; Heim 2008; Kratzer 2009; Spathas 2010; Sudo 2012). In Greek, non-alternating nouns of the *brother*-class have been claimed to contain a presupposition about the gender of the referent (21) (Merchant 2014: 19, Sudo & Spathas 2016: 715).


Intended: 'Petros is a good brother, but Maria is a bad one (sister).'

b. \* I Maria ine kali adherfi, ala o Petros ine enas kakos ⟨adherfos⟩.
   the Maria is good.f sister but the Petros is a.m bad.m brother
   Intended: 'Maria is a good sister, but Petros is a bad one (brother).'
   (Greek; Merchant 2014: 12)

The elided noun and its antecedent have conflicting gender presuppositions and are therefore not mutually entailing:

(23) ∃*x* : male(*x*). sibling(*x*) ↮ ∃*x* : female(*x*). sibling(*x*)

In the analysis of two-way alternating "epicene" nouns, however, it is often assumed that they do not contain any lexical presuppositions about gender (24), and thus ellipsis is licensed (26).

(24) ⟦jatros⟧ = *λx*<sub>*e*</sub> . doctor(*x*)


	- b. I Maria ine kali jatros, ala o Petros ine enas kakos ⟨jatros⟩.
	  the Maria is good.f doctor but the Petros is a.m bad.m doctor
	  'Maria is a good doctor, but Petros is a bad one.'

(Greek; Merchant 2014: 15)

(26) ∃*x* . doctor(*x*) ↔ ∃*x* . doctor(*x*)

If we adopt a similar approach for hybrid nouns in BCS, then hybrid nouns agreeing in grammatical gender (feminine) do not contain any presupposition about gender, while the hybrid nouns that agree in natural gender (masculine) should have an additional gender presupposition. Assuming that the grammatical feminine gender of hybrid nouns does not contribute any gender presupposition, this would make it compatible with male and female referents:

$$(27)\quad [\![ n_{[\text{F}]} ]\!] = \lambda P \lambda x \,.\, P(x)$$

For masculine-agreeing hybrid nouns, let us assume that the denotation of natural gender is a partial identity function restricting the set of customers to the set of male customers (Cooper 1983).

$$(28)\quad [\![ n_{[\text{M}]} ]\!] = \lambda P \lambda x : \text{MALE}(x) \,.\, P(x)$$

The introduction of a gender presupposition with masculine agreeing hybrid nouns makes them incompatible with female referents.<sup>6</sup>

	- b. \* Marija nam je nov-i mušterija.
	  Marija us is new-m customer
	  'Marija is our new customer.'

a. ⟦(29a)⟧ = customer(Marija)

b. ⟦(29b)⟧ = customer(Marija), defined only if Marija is male *(presupposition failure!)*

<sup>6</sup>While other theories rule out such examples based on competition with the feminine agreeing form (e.g. Maximize Presupposition; Bobaljik & Zocca 2011: 148f. or Principle of Gender Competition; Sudo & Spathas 2016: 722), we argue that the syntactic presence of both features is necessary based on instances of mixed agreement, such as (3).
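The difference between the two denotations can be illustrated with a small sketch in which a presupposition is modelled as a definedness condition that raises an error when it is not met. The toy model (the sets `MALE` and `CUSTOMERS`) and all function names are hypothetical and serve only to illustrate the partial identity function.

```python
class PresuppositionFailure(Exception):
    """Raised when a predicate is applied outside its domain of definedness."""


# A toy model: who is male and who is a customer (invented for illustration).
MALE = {"Milan"}
CUSTOMERS = {"Milan", "Marija"}


def customer(x):
    return x in CUSTOMERS


def is_male(x):
    return x in MALE


def n_fem(noun):
    """Grammatical feminine: plain identity function, no gender presupposition."""
    return noun


def n_masc(noun):
    """Natural masculine: partial identity function, defined only for male individuals."""
    def restricted(x):
        if not is_male(x):
            raise PresuppositionFailure(f"{x} is not male")
        return noun(x)
    return restricted


print(n_fem(customer)("Marija"))   # True
print(n_masc(customer)("Milan"))   # True
try:
    n_masc(customer)("Marija")
except PresuppositionFailure as failure:
    print("presupposition failure:", failure)
```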

Returning to gender mismatches under ellipsis, in an example such as (31), involving feminine agreement in both clauses, both instances of *mušterija* will lack the natural gender feature and the corresponding presupposition.

(31) Milica je povremen-a mušterija, a Jovan redovn-a ⟨mušterija⟩.
     Milica is occasional-f customer and Jovan regular-f customer
     'Milica is an occasional customer and Jovan a regular one.' ffmf

In order to analyze the ellipsis patterns, we adopt the assumptions of Distributed Morphology that nouns are built up from a category-neutral root and a head *n* that categorizes this root as nominal (Halle & Marantz 1993; Harley & Noyer 1999; Kihm 2005; Acquaviva 2009; Kramer 2015). Following Kramer (2015), we will treat gender for now as a feature introduced by *n* (although see below for more detail about the different possibilities of simultaneous separate structural encoding of natural and grammatical gender on hybrid nouns). We assume further that number features of nouns are introduced on a NumP. These assumptions yield the structure for (31) as given in (32), where ellipsis is triggered by an [E] feature on Num (e.g. Merchant 2014; Saab & Lipták 2016; Saab 2019).

Importantly, from the point of view of the e-givenness condition in (19), mutual entailment is trivially satisfied since the elided material is identical.


However, as soon as we have masculine agreement on one of the conjuncts, we necessarily have the relevant natural gender feature [m] of the referent, making the ellipsis unacceptable (34). The relevant structure, demonstrating the lack of feature identity, is given in (35).

(34) \* Milica je povremen-a mušterija, a Jovan redovn-i ⟨mušterija⟩.
     Milica is occasional-f customer and Jovan regular-m customer
     'Milica is an occasional customer and Jovan a regular one.' ffmm

Recall that the masculine feature also introduces the presupposition that the referent is male. If there is no corresponding feature in the antecedent, then e-givenness is violated because there being a customer does not entail there being a male customer:
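A sketch of this failure of entailment, written in the notation of (23) and (26) and offered as an illustration rather than as the original formula:

$$\exists x \,.\, \text{customer}(x) \;\not\rightarrow\; \exists x : \text{male}(x) \,.\, \text{customer}(x)$$

Since entailment fails in this direction, the two expressions cannot be mutually entailing.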

The same situation holds if masculine agreement obtains in the antecedent:


(37) \* Jovan je redovn-i mušterija, a Milica povremen-a ⟨mušterija⟩.
     Jovan is regular-m customer but Milica occasional-f customer
     'Jovan is a regular customer and Milica an occasional one.' mmff

Since *mutual* entailment is required, the presence of masculine gender in only one of the conjuncts results in ungrammaticality, because the e-givenness requirement is not met.

This accounts for why mismatches in referent gender are not tolerated if the adjective agrees in masculine in one conjunct only. As we expect, if we have two masculine referents (39), then mutual entailment is restored (40).

(39) Uroš je redovn-i mušterija, a Tomislav povremen-i ⟨mušterija⟩.
     Uroš is regular-m customer and Tomislav occasional-m customer
     'Uroš is a regular customer and Tomislav an occasional one.' mmmm

Exactly why masculine agreement with hybrid nouns (39) is not possible for all speakers is an open issue, and we suggest it could be due to not all speakers having the variant of *n* containing the additional [m] feature. We leave further examination of interspeaker variation to future research.
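The licensing pattern across conditions can be summarized in a short sketch. It makes the simplifying assumption that a gender presupposition can be treated as an additional conjunct of the existentially closed noun meaning, so that entailment reduces to set inclusion; the function names and the representation are hypothetical and do not reproduce the formal definition in (19).

```python
def closure(clause_code: str) -> frozenset:
    """Existentially closed noun meaning for a clause code such as 'mf', 'mm' or 'ff'.

    Only natural masculine agreement ('mm') adds the condition 'male';
    feminine agreement leaves the noun without a gender presupposition.
    """
    if clause_code == "mm":
        return frozenset({"customer", "male"})
    return frozenset({"customer"})


def entails(a: frozenset, b: frozenset) -> bool:
    """a entails b if every condition imposed by b is also imposed by a."""
    return b <= a


def e_given(antecedent: frozenset, ellipsis_site: frozenset) -> bool:
    """Mutual entailment between the antecedent and the ellipsis site."""
    return entails(antecedent, ellipsis_site) and entails(ellipsis_site, antecedent)


def ellipsis_licensed(condition: str) -> bool:
    """Split a four-letter condition into antecedent and ellipsis clause codes."""
    return e_given(closure(condition[:2]), closure(condition[2:]))


for condition in ["ffff", "ffmf", "mfff", "ffmm", "mmff", "mmmm"]:
    print(condition, ellipsis_licensed(condition))
```

Under these assumptions, ellipsis comes out as licensed for ffff, ffmf, mfff and mmmm, and as unlicensed for ffmm and mmff.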

### **5.1 The encoding of natural and grammatical gender features**

Since the nouns addressed in this paper have the possibility to simultaneously encode both the grammatical feminine and the natural masculine gender (as illustrated by (3), see also Wechsler & Zlatić 2003; Despić 2017; Puškar 2017), a question that remains open is where exactly these features are encoded in the DP structure.


Some accounts (Matushansky 2013; Pesetsky 2013; Landau 2016) assume grammatical gender to be encoded low in the noun's structure, as a property related to the nominal stem. Natural gender is optionally introduced on a higher functional projection. Under this approach, we would treat the grammatical feminine gender of our hybrid nouns as a property of *n*, while natural gender would be introduced at a higher functional projection (e.g. GenP, cf. Picallo 1991). The lower gender would not introduce any gender presuppositions, while the denotation of the gender on Gen would be a partial identity function restricting the set of customers to the set of male customers (Cooper 1983). If the male-referring hybrid noun triggers feminine agreement, adjectival concord then targets the grammatical gender feature on *n* (41).

For masculine-agreeing hybrid nouns, the closest target for Agree will be the higher masculine gender (42).

$$\begin{aligned}
[\![\text{nP}]\!] &= \lambda x \,.\, \textbf{customer}(x) \\
[\![\text{GenP}]\!] &= [\![\text{Gen}]\!]([\![\text{nP}]\!]) \\
&= [\lambda P \lambda x : \text{MALE}(x) \,.\, P(x)](\lambda x \,.\, \textbf{customer}(x)) \\
&= \lambda x : \text{MALE}(x) \,.\, [\lambda x' \,.\, \textbf{customer}(x')](x) \\
&= \lambda x : \text{MALE}(x) \,.\, \textbf{customer}(x)
\end{aligned}$$

Ellipsis would be licensed if neither the antecedent nor the elided noun projects the GenP, or if both of them do. The existence of GenP and the masculine feature in only the antecedent or only the ellipsis site would lead to a lack of mutual entailment and the concomitant impossibility of ellipsis, in the manner proposed above.

Another theoretical option would be to make the GenP the locus of grammatical gender, and the *n* of the natural one for the hybrid nouns of this particular type (see Puškar 2017; 2018). A hybrid noun with no natural gender would be represented as (43), while the noun with the natural gender feature would have the structure in (44). Under this approach, the natural gender feature of *n* would have a more complex syntactic representation than the grammatical one. The gender probe is relativized towards the more complex gender. If this gender is present on the *n*, it will be the preferred goal for Agree (44). If it is absent, Agree would copy the higher grammatical gender (43) (see Puškar 2017 for further detail).

If this analysis were to be adopted, the ungrammatical sentences would be ruled out by prohibiting any mismatches between the features of antecedent and those of the ellipsis site (cf. Merchant 2013).

To sum up, both types of approaches to the structural position of gender features would in principle be compatible with the results of our experiment. Future research should then be tasked with teasing apart the two and defining the exact locus of the natural and grammatical gender features. However, what the results undoubtedly reveal is the necessity to represent the two types of features separately, as well as that they differ in quality, such that natural gender has an additional meaning component.

### **6 Conclusion**

This paper has presented new experimental data from NP ellipsis showing that hybrid nouns in BCS show a split behaviour with regard to gender mismatches under ellipsis: if they agree in feminine gender, then mismatches in referent gender are permitted for either the antecedent or elided noun. However, masculine agreement in either conjunct blocks gender mismatches. We have linked this to the optional presence of a natural [m] gender feature which represents a target for adjectival concord and introduces a gender presupposition. It is this gender presupposition that destroys the mutual entailment relation required by Merchant's (2001) e-givenness requirement on ellipsis licensing. Consequently, this study suggests that natural masculine gender is syntactically absent when the adjective agrees in grammatical gender.

### **Abbreviations**


### **Acknowledgements**

We would like to thank Ana Bosnić and Joanna Zaleska for the help with the experimental design and coding, as well as Boban Arsenijević, Jelena Stojković, Andrew Nevins and the audience of FDSL 12 for the valuable comments and feedback.

### **Appendix**

We used an ordinal Bayesian regression model fitted with the package MCMCglmm (Hadfield 2010) in R (R Core Team 2016). We used non-informative priors (an inverse gamma with V=1 and nu=0.002). Table 4 presents the posterior mean estimates, the confidence intervals and the equivalent of a Bayesian p value. The corresponding posterior estimates of the random effects are shown in Table 5.


Table 4: Coefficients for the MCMC model with confidence intervals and cutpoints.

Table 5: Random effects for the MCMC model.


### **References**




### **Chapter 15**

## **Extract to unravel: Left branch extraction in Romanian/Serbian code-switching**

### Vanessa Petroj

University of Connecticut

Bošković (2008; 2012) argues that languages with and without articles differ considerably with respect to the structure of the nominal domain (among other differences), leading to a distinction between DP (languages with articles) and NP (article-less) languages. Namely, DP languages are proposed to have a functional layer (DP) above the NP where articles are presumed to be positioned, while lacking definite articles indicates the absence of this functional layer in a language, allowing for bare NPs. This structural difference has semantic and syntactic consequences, one of which is the (im)possibility of left branch extraction (LBE) of adjectives and adjective-like elements out of the nominal domain. Specifically, while LBE is allowed in NP languages, it is disallowed in DP languages (Bošković 2008; 2012). While (dis)allowing LBE is fairly straightforward in languages in isolation, here, I extend this test to mixed DP/NP structures resulting from Romanian/Serbian code-switching (CS). Following the DP/NP language distinction, I consider Romanian to be a DP language, disallowing LBE, and Serbian an NP language, allowing LBE. Consequently, I apply the LBE of adjectives from internal and external arguments of the verb, with switches at various points in the derivation. I show that LBE is reliable in determining the points where CS occurs, whether we are dealing with an NP or a DP projection, but also in showing that mixing two languages may not necessarily result in a uniform system. In other words, through LBE, the structural flexibility resulting from different points of CS indicates that CS, like LBE, is highly contextual and sensitive to phases and phasal domains.

**Keywords:** left branch extraction, code-switching, Romanian, Serbian

Vanessa Petroj. 2018. Extract to unravel: Left branch extraction in Romanian/Serbian code-switching. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 337–355. Berlin: Language Science Press. DOI:10.5281/zenodo.2545535


### **1 Introduction**

Code-switching (CS) represents the alternation of elements from two languages during a single phrase, clause, or utterance (Poplack 1980; Gonzales Velásquez 1995; MacSwan 1999; Muysken 2000; among others). In this paper, the focus is on CS in Romanian-Serbian bilinguals from a small, culturally Romanian town in the Republic of Serbia. CS constructions, just like constructions belonging to any other natural language, are tested here on the basis of grammaticality judgements of bilingual native speakers. Specifically, I investigate how relevant CS constructions that contain elements from Romanian (a DP language) and Serbian (an NP language) fare with respect to left branch extraction (LBE) of adjectives out of the traditional noun phrase (TNP).<sup>1</sup> Given that LBE is allowed in NP but not DP languages (Uriagereka 1988; Bošković 2008; 2012), the combination of elements belonging to the two parameter settings (DP/NP) has consequences for the (im)possibility of LBE in CS. More importantly, I show that LBE is a reliable test to (i) identify which parameter setting prevails in certain environments, (ii) identify points of CS, and (iii) show that CS, like LBE, is contextual and depends on the elements that participate in the switch within a spell-out domain.

The paper is organized as follows. §2 provides the demographics, methods, and type of data used for this study. In §3, basic assumptions and relevant LBE background are introduced. §4 gives the background of the relevant CS construction and introduces the main questions addressed in this paper, and §5 investigates LBE in CS. Finally, §6 concludes the paper and offers future research directions.

For ease of exposition, I will follow the common practice of marking elements from the two languages uniformly throughout the paper; in CS examples, **Romanian** elements will be in **bold**, and *Serbian* in *italics*.

### **2 Data and methods**

Data for this study were gathered over the course of several years. Examples found in this paper are extracted from speech produced by Romanian-Serbian bilingual speakers from a culturally Romanian town called Uzdin, in Vojvodina, Serbia. The methods of data gathering include interviews targeting spontaneous production, elicitation, and grammaticality judgements.<sup>2</sup>

<sup>1</sup>The term traditional noun phrase covers both NP and DP, whichever applies in a given language, assuming the so-called DP/NP parameter. Under the particular approach of Bošković (2014), the TNP in languages with articles is DP, and in article-less languages it is NP. TNP is generally considered a phase; consequently, DP is a phase in DP languages, while NP is a phase in NP languages.

Uzdin is one of several towns in Serbia where the Romanian language, culture, and customs have been highly preserved and nurtured. The author interviewed 8 subjects, with a mean age of 27. All subjects have at least a college degree, and attended grades K-8 in Romanian, and high school and college in Serbian. This Romanian community is highly bilingual, with a lot of code-switching occurring on a daily basis.

### **3 Relevant background**

### **3.1 General assumptions**

There are two underlying assumptions in this paper. The first is broad, referring to the approach and analysis of CS constructions. Following a number of authors (Gonzales Velásquez 1995; Bhatia & Ritchie 1996; den Dikken 2011; Bandi-Rao & den Dikken 2014), I do not assume CS to impose restrictions that apply to CS constructions alone. Rather, given that participating languages are natural languages that adhere to UG principles, I treat CS in the same way. The second assumption is specific, concerning the language pair in question. Following Bošković (2008; 2012), I consider Romanian and Serbian to differ with respect to whether they have or lack definite articles and, consequently, whether they have or lack the DP layer.

### **3.2 DP/NP languages and left branch extraction**

According to Bošković (2008; 2012), languages with and without articles differ in a systematic way. Empirically, the setting of the DP/NP parameter has been shown to have consequences not only for the structure of the TNP, but also for a number of different syntactic and semantic phenomena. This has allowed for the investigation of numerous crosslinguistic differences and similarities on a structural level. Bošković (2008; 2012) presents a number of generalizations that group languages based on the presence or absence of definite articles. The one relevant for current purposes is given in (1):

(1) Only languages without articles may allow left branch extraction.

<sup>2</sup> For a more detailed overview discussing the subjects, data, and methods, I refer the reader to Petroj (in prep).


While I will only focus on the generalization in (1), I refer the reader to Bošković (2008; 2012) for a comprehensive list of generalizations with discussion.

As stated, one of the tests used to capture the crosslinguistic asymmetry between DP and NP languages is LBE of adjectives and adjective-like elements out of the TNP, with the generalization that LBE may only be allowed in NP languages (Bošković 2008; 2012). Starting with the Slavic language family, only Bulgarian and Macedonian disallow LBE, and these are the only two languages that have (definite) articles. In Romance, the only language that allows LBE is Latin, and this is also the only Romance language that lacks articles. A very important example that contributes to the LBE generalization is the case of Finnish, discussed in Franks (2007). Namely, Finnish is an article-less language and it allows LBE. Interestingly, as articles started to develop in colloquial Finnish, LBE constructions immediately became very marginal and unacceptable. We see a similar case of variation within a single language in Ancient Greek, where the varieties belonging to two different periods pattern differently with respect to the presence of articles and, therefore, with respect to LBE as well. Koine Greek has articles and disallows LBE, while LBE was used productively in Homeric Greek, which lacks articles. There are a few more languages that allow LBE, and these are: Mohawk, Southern Tiwa, Gunwinjguan (Baker 1996), Hindi, Bangla, Angika, and Magahi. These are all article-less languages.<sup>3</sup>

Moving on to concrete examples, while LBE is disallowed in English (a DP language) (2), and in Spanish (a DP language) (3), it is allowed in Serbian (an NP language) (4):<sup>4</sup>


	- b. \* Profesionales<sup>1</sup> professional.pl.f ofrecía used.to.offer.1pl [DP traducciones translations.pl.f t1]. 'I used to offer professional translations.' (Spanish, Riqueros 2013)

<sup>3</sup>There is an additional requirement for a language to allow LBE, namely agreement between the noun and the adjective. This, in turn, answers the question of why Chinese, which has very poor agreement morphology, disallows LBE even though it lacks articles. I will not be concerned with this requirement in this paper.

<sup>4</sup>Note that LBE is not possible with non-agreeing adjectives in Serbian (see Bošković 2013).


As predicted, English (a DP language) disallows LBE, while Serbian (an NP language) allows it. To account for this contrast, Bošković (2013; 2014) proposes a contextual approach to phases in which the highest phrase in the extended domain of a lexical head acts as a phase. NP and DP languages then differ with respect to the phasal boundaries. Specifically, NP is a phase in NP languages, while DP is a phase in DP languages. Furthermore, assuming that the edge of each phase is visible to the next phase (Chomsky 2001), i.e., it is available for extraction and movement, the adjective occupies significantly different positions relative to the phasal edge in NP and DP languages. This is illustrated in (5), where the adjective is at the edge of the TNP phase in NP languages (5a) and extraction of the adjective is allowed, versus DP languages in (5b), where DP is the phase, and the adjective is not at the edge of the TNP phase (the TNP being DP in this case). In order to be available for movement, the adjective has to move to SpecDP due to the Phase Impenetrability Condition (PIC) (Chomsky 2001), but this movement is blocked by antilocality, which requires the AP movement to cross a full phrase. In the case of (5b), AP does not cross a full phrase, only a segment.
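The reasoning in this paragraph can be restated as a small decision procedure. The following Python sketch is only an illustration under simplifying assumptions: the names `TNP`, `phase`, `full_phrases_crossed` and `lbe_allowed` are hypothetical, the adjective is assumed to sit at the edge of NP, and only the two configurations discussed here (a bare NP and a DP dominating NP) are intended to be covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TNP:
    # Projections from highest to lowest, e.g. ("DP", "NP") or ("NP",).
    projections: tuple
    # Projection at whose edge the adjective sits (assumption: NP).
    adjective_host: str = "NP"


def phase(tnp: TNP) -> str:
    """Contextual phasehood: the highest projection of the TNP is the phase."""
    return tnp.projections[0]


def full_phrases_crossed(tnp: TNP) -> int:
    """Full phrases crossed when the AP moves from its host to the phase edge."""
    host_depth = tnp.projections.index(tnp.adjective_host)
    # Leaving the edge of the host crosses only a segment of the host itself,
    # so only projections strictly between host and phase are counted.
    return max(host_depth - 1, 0)


def lbe_allowed(tnp: TNP) -> bool:
    if tnp.adjective_host == phase(tnp):
        # The AP is already at the phase edge, so the PIC does not block it.
        return True
    # Otherwise the AP must first reach the phase edge; antilocality requires
    # that step to cross at least one full phrase.
    return full_phrases_crossed(tnp) >= 1


serbian_tnp = TNP(projections=("NP",))        # NP language: NP is the phase
romanian_tnp = TNP(projections=("DP", "NP"))  # DP language: DP is the phase

print(lbe_allowed(serbian_tnp))   # True
print(lbe_allowed(romanian_tnp))  # False
```

The sketch encodes the PIC and antilocality as two separate checks, so that the failure in the DP case can be traced to the antilocality step rather than to the phase boundary as such.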

When this is applied to Romanian and Serbian, the outcome is clear. Serbian (NP) allows LBE as in (6), and Romanian (DP) disallows it, as in (7):



(7) a. **Am** have.aux.1sg **văzut** seen.ptcp {**scumpe** expensive.pl.f / **scumpe-le**} expensive-the.pl.f **automobile**. cars.pl.f 'I saw {expensive / the expensive} cars.' (Petroj in prep)

b. \* {**Scumpe**<sup>1</sup> expensive.pl.f / **Scumpe-le**<sup>1</sup>} expensive-the.pl.f **am** have.aux.1sg **văzut** seen.ptcp [DP t<sup>1</sup> **automobile**]. cars.pl.f Intended: 'I saw {expensive / the expensive} cars.' (Petroj in prep)

Structurally, this looks as follows: In Serbian, the LBE of adjectives (located in SpecNP) takes place through one movement out of the NP, as in (8a). In Romanian, however, a more complex movement is required. In order for the adjective to be extracted, the AP (that has previously merged with D<sup>0</sup> through Affix Hopping) first has to proceed through SpecDP, which is the edge of the phase; only then would it be visible for further movement. This first movement, however, is blocked by antilocality.<sup>5</sup> This is illustrated in (8b).<sup>6</sup>

While affairs are clear in Romanian and Serbian in isolation, the mixed parameter settings in Romanian/Serbian CS pose an important question with respect to which setting prevails in the relevant CS constructions: DP or NP. To address this question, I will examine LBE of adjectives in CS, starting with simple transitive constructions. However, before testing LBE, the next section offers facts about elements participating in the CS TNP that are relevant for understanding the LBE of adjectives in CS.

<sup>5</sup>There are accounts where Romanian APs move to SpecDP (this is why they can precede the article, see Abney 1987; Dobrovie-Sorin 1993; Ungureanu 2006; a.o.). These accounts face a problem: if movement to SpecDP is possible, APs should be allowed to move out of DPs, too.

<sup>6</sup> For the complete analysis of the definite article being hosted by the noun or the adjective, I refer the reader to Petroj (in prep).

### **4 Relevant code-switching background**

As mentioned, Romanian and Serbian differ with respect to the DP/NP parameter setting – Romanian being a DP (having articles) and Serbian an NP language (lacking articles).


Given Bošković (2008; 2012) and the numerous generalizations that group languages according to the DP/NP parameter, Romanian and Serbian bring two clashing parameter settings into combined structures. Although CS occurs on various levels (cf. Petroj in prep), the relevant construction is represented in (10):

(10) *teški* difficult.lf.sg.m *ispit*-**ul** exam.sg.m-the.sg.m 'the difficult exam'

In this construction, the elements that participate in CS are the Romanian definite article **-ul**, the Serbian noun *ispit*, and the Serbian adjective *teški*. The Romanian and Serbian counterparts of this construction are illustrated below in (11a) and (11b), respectively:

	- b. *teški* difficult.lf.sg.m *ispit* exam.sg.m 'the difficult exam'


Being either an NP or a DP language has additional consequences. In this case, it means different ways in which a language can express definiteness. Specifically, while Romanian expresses definiteness through definite articles on nouns (12a) or adjectives (12b), Serbian has an alternative way of obtaining definite versus indefinite interpretation. As illustrated in Table 1, Serbian has two lexical forms for adjectives: long form (lf) and short form (sf). These two forms are considered by some authors (Aljović 2002; Despić 2011; Talić 2014) to correspond to definite/specific (13a) and indefinite/non-specific (13b) interpretations, respectively.<sup>7</sup>

Table 1: Serbian short form vs. long form adjectives


	- b. **greu-l** difficult.sg.m-the.sg.m **examen** exam.sg.m 'the difficult exam'
	- b. *težak* difficult.sf.sg.m *ispit* exam.sg.m 'a difficult exam'

What is most striking about constructions like (10) is the combination of elements that is not found in either of the participating languages.<sup>8</sup> In other words, the resulting structure is a combination of two definiteness-related elements – a Romanian definite article and a Serbian long-form (definiteness-imposing) adjective – in one TNP. Although coming from languages with different architectures, the elements form a cohesive and productive mixed structure. The fact that both languages can express definiteness separately and that both definite elements are allowed in a single construction raises the question about the underlying structure of cases like (10). Specifically, does the resulting construction have the DP layer like in Romanian, or is it an NP construction like in Serbian?

<sup>7</sup> For current purposes, I will simplify matters a bit and will consider the long vs. short form contrast to impose a definite vs. indefinite NP interpretation, respectively. For relevant discussion, see Aljović (2002); Despić (2011); Talić (2014); Stanković (2015); a.o.

<sup>8</sup> For a comprehensive analysis and account of the CS TNP and the interaction of Romanian definite articles, Serbian nouns, and Serbian adjectives, I refer the reader to Petroj (in prep).

Although having the definite article in the structure should indicate the presence of the DP layer, the fact that CS represents a mixture of (in this case) two parameter settings does not necessarily point towards the dominance of either one of the participating languages. On the one hand, the presence of the definite article may indicate that there is, in fact, a DP layer in (10), and that **-ul** is positioned in D<sup>0</sup>. On the other, given that all three elements (D, N, and A) undergo agreement in CS (Petroj in prep), the definiteness may be licensed by the Serbian long-form adjective, and the DP layer may not exist.<sup>9</sup> One way to confirm that the DP layer indeed exists in this type of construction is by turning to the contextual approach to phases. Recall that this approach says that any phrase can be a phase, as long as it is the highest in its domain. As seen above, the edge of the phase is available for further operations, while the rest of the construction is frozen inside the phase. That being said, there are two possibilities regarding the status of the CS TNP: (i) if there is no DP and the highest phrase in the TNP domain is NP, the adjective is in SpecNP and it should be extractable, allowing for the possibility of LBE; (ii) if there is a DP layer present, i.e. DP is a phase, the adjective being in SpecNP would make it too deeply embedded for extraction (only SpecDP being visible as the edge of the phase); LBE, in this case, will not be allowed.

To test this, the next section focuses on LBE from the CS TNP in internal and external arguments, respectively.

### **5 Left branch extraction in code-switching**

### **5.1 Left branch extraction in Romanian and Serbian**

As LBE is a reliable test for identifying the DP/NP parameter setting of a natural language, the same test is applied to CS constructions that include structures like (10), repeated below as (14).

<sup>9</sup>By agreement, I refer to the forms that the adjective and the article take relative to the gender of the noun.


(14) *teški* difficult.lf.sg.m *ispit*-**ul** exam.sg.m-the.sg.m 'the difficult exam'

Recall that as predicted by the generalizations in Bošković (2008), Romanian, being a DP language, disallows LBE and Serbian, an NP language, allows it. This is illustrated in (6) for Serbian and in (7) for Romanian, repeated below as (15) and (16), respectively:


As seen above, facts are clear for Romanian and Serbian in isolation. In the remainder of this section, LBE of adjectives will be applied to CS TNPs from transitive constructions and from the subject.

### **5.2 Transitive constructions**

The paradigm below starts with (17), in which CS occurs within a TNP where the verb is Romanian, the definite article is Romanian, and the noun and the adjective are Serbian. As illustrated in (17b), LBE out of this TNP is disallowed. In (18), the verb is still Romanian, but even a fully Serbian TNP fails the LBE test. Interestingly, when the Romanian verb is replaced by its Serbian counterpart in (19), LBE improves drastically. Moreover, while the Serbian verb *can* take a DP complement in (20a), extraction of the adjective is blocked in (20b), confirming that **-ul** may indeed point towards the existence of the DP layer.<sup>10</sup>

(17) a. **Am** have.aux.1sg **trecut** passed.ptcp *teški* difficult.lf.sg.m *ispit-***ul**. exam.sg.m-the.sg.m 'I passed the difficult exam.'

b. \* *Teški*<sup>1</sup> difficult.lf.sg.m **am** have.aux.1sg **trecut** passed.ptcp [t<sup>1</sup> *ispit***-ul**]. exam.sg.m-the.sg.m 'I passed the difficult exam.'

(18) a. **Am** have.aux.1sg **trecut** passed.ptcp *teški* difficult.lf.sg.m *ispit*. exam.sg.m 'I passed the difficult exam.'

b. \* *Teški*<sup>1</sup> difficult.lf.sg.m **am** have.aux.1sg **trecut** passed.ptcp [t<sup>1</sup> *ispit*]. exam.sg.m Intended: 'I passed the difficult exam.'

(19) a. **Am** have.aux.1sg *položila* passed.sg.f *teški* difficult.lf.sg.m *ispit*. exam.sg.m 'I passed the difficult exam.'

b. ? *Teški*<sup>1</sup> difficult.lf.sg.m **am** have.aux.1sg *položila* passed.sg.f [t<sup>1</sup> *ispit*]. exam.sg.m 'I passed the difficult exam.'

(20) a. **Am** have.aux.1sg *položila* passed.sg.f *teški* difficult.lf.sg.m *ispit***-ul**. exam.sg.m-the.sg.m 'I passed the difficult exam.'

b. \* *Teški*<sup>1</sup> difficult.lf.sg.m **am** have.aux.1sg *položila* passed.sg.f [t<sup>1</sup> *ispit***-ul**]. exam.sg.m-the.sg.m Intended: 'I passed the difficult exam.'

Based on the above discussion, I take (dis)allowing LBE to indicate the presence or absence of the DP layer. The ungrammaticality of (18b) and (20b) then indicates that any Romanian element in the VP domain forces DP-hood on the object. What is particularly interesting is that in (18b), although the entire TNP is in Serbian, LBE still cannot take place. This suggests that although no Romanian D element is present overtly, there is still a DP projection here, which is not the case in (19), where LBE improves drastically with a Serbian verb introduced in the structure. Additionally, the paradigm in (17)–(20) confirms that regardless of the verb being Romanian or Serbian, an object that contains a Romanian element will always have the DP layer.

<sup>10</sup>I would like to thank an anonymous reviewer for noticing the incomplete paradigm and pointing out the relevance of the example in (19).

Given that both Romanian and Serbian verbs can occur and take either a Romanian or a Serbian complement in CS, the data above indicate that Romanian verbs must take a DP complement even in CS, as in (21a), while a Serbian verb can take either an NP complement, as in (19b), or a DP complement, as in (21b).


We then have the generalization in (22):<sup>11</sup>

(22) Romanian verbs must take a DP complement, while Serbian verbs can take either a DP or an NP complement.
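As a rough illustration, the generalization in (22) can be checked against the judgements reported for (17b)–(20b). The Python sketch below is not part of the analysis: `ALLOWED_COMPLEMENTS`, `object_category` and `lbe_possible` are hypothetical names, "verb" refers to the lexical verb only (the auxiliary is Romanian throughout the paradigm), and the marginal status of (19b) is simplified to "possible".

```python
# Generalization (22) as a lookup: complement categories each verb allows.
ALLOWED_COMPLEMENTS = {
    "Romanian": {"DP"},
    "Serbian": {"DP", "NP"},
}


def object_category(verb_language: str, has_romanian_article: bool) -> str:
    """Assumed category of the object TNP, given the verb and the article."""
    if has_romanian_article:
        # An overt Romanian article is taken to signal a DP layer.
        return "DP"
    # Without an article, the object is an NP only if the verb permits NPs.
    return "NP" if "NP" in ALLOWED_COMPLEMENTS[verb_language] else "DP"


def lbe_possible(verb_language: str, has_romanian_article: bool) -> bool:
    """LBE is predicted to be possible only out of an NP object."""
    return object_category(verb_language, has_romanian_article) == "NP"


# (language of the lexical verb, Romanian article present on the object)
paradigm = {
    "(17b)": ("Romanian", True),
    "(18b)": ("Romanian", False),
    "(19b)": ("Serbian", False),
    "(20b)": ("Serbian", True),
}

for label, (verb, article) in paradigm.items():
    print(label, lbe_possible(verb, article))
```

Under these assumptions, only the configuration of (19b) comes out as allowing LBE, which matches the judgements in (17)–(20).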

I will now test the LBE of adjectives out of a ditransitive construction. Examples in (23) and (25) represent fully Serbian sentences with the LBE of the possessor out of the indirect object (IO) in (23b) and direct object (DO) in (25b). As expected, Serbian being an NP language, LBE is allowed in both cases. In contrast, when a Romanian object is introduced into the structure in (24) and (26), LBE out of the Serbian object in (24b) and (26b) leads to ungrammaticality.<sup>12</sup>

<sup>11</sup>The pattern of certain elements allowing DP or NP arguments seems to extend beyond the VP domain, specifically, with respect to CS of conjuncts and coordinated structures. I refer the reader to Petroj (in prep) for more examples and more detailed explanation.

<sup>12</sup>**Pe** in (24) is a dummy preposition assigning the accusative to its complement. It is comparable to the Spanish *a,* illustrated in (i).

(i) Lo him.cl.acc.m vimos saw.1pl a a Juan. Juan 'We saw John.' (Spanish; Jaeggli 1986)

	- b. *Svom*<sup>1</sup> her.poss.refl.dat *moja* my.nom *drugarica* friend.nom *predstavlja* introduce.3sg [NP t<sup>1</sup> *prijatelju*] friend.dat [NP *Jovana*]. Jovan.acc 'My friend introduces Jovan to her friend.'
	- b. \* *Svom*<sup>1</sup> her.poss.refl.dat *moja* my.nom *drugarica* friend.nom *predstavlja* introduce.3sg [NP t<sup>1</sup> *prijatelju*] friend.dat [DP **pe** pe **Jovan**]. Jovan Intended: 'My friend introduces Jovan to her friend.'
	- her.poss.refl my.nom friend.nom send.3sg book.acc [NP *mom* my.dat *bratu*]. brother.dat 'My friend sends her book to my brother.'


b. \* *Svoju*<sup>1</sup> her.poss.refl.acc *moja* my.nom *drugarica* friend.nom *šalje* send.3sg [NP t<sup>1</sup> *knjigu*] book.acc [DP **fratelui** brother.dat **meu**]. my Intended: 'My friend sends her book to my brother.'

(24) and (26) show that when one object is in Romanian and the other in Serbian, LBE is not allowed even when it is attempted out of the TNP that contains Serbian elements only. This is especially interesting since LBE was allowed once a Serbian verb was introduced into the structure in (19). (24) and (26) indicate that any Romanian element (not just the verb) in the *v*P/VP domain blocks LBE. With respect to the DP/NP status, it seems that both objects are DPs when one object is in Romanian. These examples then indicate that no structural mixing regarding the categorical status is allowed between the objects in a double object construction (where one object would be an NP and one object a DP); if one object is a DP, both must be DPs. Consequently, if *v*P is considered a phase, the following generalization can be made:<sup>13</sup>

(27) No mixing of the categorical status of the TNP within a spell-out domain, where the spell-out domain is a phasal complement.
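The effect of (27) on the two objects of a ditransitive can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's formalization: `resolve_categories` is a hypothetical helper, each TNP is represented only by the set of languages contributing elements to it, and the verb is assumed to be Serbian, so that NP objects are in principle available.

```python
def resolve_categories(tnps):
    """Categorial status of every TNP inside one spell-out domain.

    `tnps` is a list of sets naming the languages that contribute elements
    to each TNP, e.g. [{"Serbian"}, {"Romanian"}].
    """
    # One TNP with Romanian material is a DP; by (27), no mixing is allowed,
    # so every other TNP in the same domain must then be a DP as well.
    any_romanian = any("Romanian" in languages for languages in tnps)
    return ["DP" if any_romanian else "NP" for _ in tnps]


# Fully Serbian objects, as in (23) and (25): both NPs, so LBE is possible.
print(resolve_categories([{"Serbian"}, {"Serbian"}]))   # ['NP', 'NP']

# One Romanian object, as in (24) and (26): both DPs, so LBE is blocked
# even out of the object that contains Serbian elements only.
print(resolve_categories([{"Serbian"}, {"Romanian"}]))  # ['DP', 'DP']
```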

<sup>13</sup>An anonymous reviewer raised an interesting question about the generalization in (27), namely, that having a Romanian low/VP-adjunct after a Serbian ditransitive construction like the one in (i) might reveal additional (counter)evidence for the structure of mixing within the spell-out domain. While the sentence in (i.b) is only marginally acceptable, the subjects reported challenges in processing the sentence, rather than in grammaticality, which can be attributed to prosodic factors of a fully Serbian LBE construction. I leave CS of adjuncts for future research.



### **5.3 Subject**

Given that having a Romanian element in either IO or DO blocks LBE from the other object (even when the other object is entirely in Serbian) it is important to test the extent of influence of the Romanian DP on the rest of the structure.

In the examples below, (28) represents a fully-Serbian example, with the possessor being extracted from the subject in (28b). This being a fully Serbian construction, LBE is allowed.

(28) a. *Tvrdiš* claim.2sg *da* that *moja* my.nom *drugarica* friend.nom *predstavlja* introduce.3sg *Petru* Petar.dat *Jovana*. Jovan.acc 'You claim that my friend introduces Jovan to Petar.'

b. *Moja*<sup>1</sup> my.nom *tvrdiš* claim.2sg *da* that [NP t<sup>1</sup> *drugarica*] friend.nom *predstavlja* introduce.3sg [NP *Petru*] Petar.dat [NP *Jovana*]. Jovan.acc 'You claim that my friend introduces Jovan to Petar.'

Interestingly, when a Romanian element is introduced as the DO in (29) and as the IO in (30), LBE out of a fully Serbian subject is permitted in both cases, as in (29b) and (30b).

	- b. *Moja*<sup>1</sup> my.nom *tvrdiš* claim.2sg *da* that [NP t<sup>1</sup> *drugarica*] friend.nom *predstavlja* introduce.3sg [NP *Petru*] Petar.dat [DP **pe** pe **Jovan**]. Jovan 'You claim that my friend introduces Jovan to Petar.'


b. *Moja*<sup>1</sup> my.nom *tvrdiš* claim.2sg *da* that [NP t<sup>1</sup> *drugarica*] friend.nom *šalje* send.3sg [NP *svoju* her.poss.refl.acc *knjigu*] book.acc [DP **fratelui** brother.dat **meu**]. my 'You claim that my friend sends her book to my brother.'

These data contrast with (24) and (26) where the introduction of a Romanian internal argument blocked LBE out of the other internal argument. In contrast, LBE out of the subject is not affected by CS in the internal arguments of the verb. Based on these examples, the following generalizations can be made:


Notice also that a Romanian external DP argument does not force DP-hood on a Serbian internal argument, as indicated by the possibility of LBE in (33):

(33) a. **Elev-ul** student.sg.m-the.sg.m **a** have.aux.3sg *položio* passed.sg.m *teški* difficult.lf.sg.m *ispit*. exam.sg.m 'The student passed the difficult exam.'

b. ? *Teški*<sup>1</sup> difficult.lf.sg.m **elev-ul** student.sg.m-the.sg.m **a** have.aux.3sg *položio* passed.sg.m [NP t<sup>1</sup> *ispit*]. exam.sg.m 'The student passed the difficult exam.'
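The asymmetry between internal and external arguments described in this section can be summarized in a short Python sketch. It rests on several assumptions that go beyond the text: `Clause` and `lbe_allowed` are hypothetical names, each argument is reduced to a single language label, "verb" means the lexical verb, and marginal (?) judgements are treated as acceptable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    subject: str    # language of the external argument
    verb: str       # language of the lexical verb
    objects: tuple  # languages of the internal arguments


def lbe_allowed(clause: Clause, target) -> bool:
    """Predict LBE out of the subject ("subject") or out of object number `target`."""
    if target == "subject":
        # The subject lies outside the vP spell-out domain, so only its own
        # make-up matters: a fully Serbian subject is an NP.
        return clause.subject == "Serbian"
    if clause.objects[target] != "Serbian":
        # An object containing Romanian material is itself a DP.
        return False
    # Internal arguments share one spell-out domain with the verb: a single
    # Romanian element there forces DP-hood on every object in the domain.
    return all(lang == "Serbian" for lang in (clause.verb,) + clause.objects)


# Serbian subject and IO, Romanian DO, as in (24) and (29).
mixed_objects = Clause("Serbian", "Serbian", ("Serbian", "Romanian"))
print(lbe_allowed(mixed_objects, "subject"))  # True
print(lbe_allowed(mixed_objects, 0))          # False

# Romanian subject, Serbian verb and object, as in (33).
romanian_subject = Clause("Romanian", "Serbian", ("Serbian",))
print(lbe_allowed(romanian_subject, 0))       # True
```

The subject is deliberately kept out of the `all(...)` check, which is what lets a Romanian element inside the verbal domain block extraction from a co-argument while leaving extraction from the subject untouched, and vice versa.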

### **6 Conclusions and further research**

Due to the DP/NP difference between Romanian and Serbian, LBE has proven reliable in determining the points where CS may occur, but also in showing that mixing two languages may not necessarily result in a homogenous DP or NP system. In other words, this variant of CS shows flexibility when it comes to elements that are switched, but also regarding what parameter setting will prevail depending on when CS occurs in the derivation. When it comes to the interaction between Romanian and Serbian elements, the following generalizations hold:



Importantly, LBE has also shed light on the flexibility of the CS construction to navigate through parameters.

3. No mixing of the categorical status of the TNP is allowed within a spell-out domain, where the spell-out domain is a phasal complement.

We can assume then that the *v*P/VP spell-out domain may look something like (34), whereby CS below the *v*P-level affects the entire phasal domain, but not the area above it:

Finally, more research needs to be done to correctly predict the points of CS in other languages with different spell-out domains/phasal boundaries in order to unravel the rules and constraints, and identify the exact points of CS.

### **Abbreviations**



### **Acknowledgements**

For helpful comments, I am thankful to Željko Bošković, as well as the audience and the three anonymous reviewers of FDSL 12. I am also grateful to Neda Todorović for constructive conversations. For judgements, I would like to thank Teodora Fizešan, Daniel Neda, Xenia Oalge, and Kristina Georgijev.

### **References**




### **Chapter 16**

## **Unifying structural and lexical case assignment in Dependent Case Theory**

### Zorica Puškar

Leibniz-Zentrum Allgemeine Sprachwissenschaft, Berlin

### Gereon Müller

University of Leipzig

Dependent Case Theory argues against case assignment via a functional head (cf. Chomsky 2000; 2001) and proposes instead that case is a result of a structural relation between two DPs (Marantz 1991; McFadden 2004; Baker & Vinokurova 2010; Baker 2015). However, Dependent Case Theory cannot completely abandon case assignment via a syntactic head, as this mechanism accounts for lexical case (e.g. lexical dative). Furthermore, structural and lexical datives are morphologically identical and often behave similarly, and 'just where the line should be drawn between the two is a theoretical matter' (Baker 2015: 13). We argue for a unified approach to lexical and structural dative case assignment under Dependent Case Theory, implemented in a derivational fashion, via the operation Agree. While structural dat is assigned as a high dependent case in the VP in the presence of a lower (later acc) DP, lexical dat is assigned in the same configuration, in the VP, in the presence of another silent or overt co-argument DP.

**Keywords:** dependent case, Agree, dative

### **1 Introduction: Dependent Case Theory**

Dependent Case Theory (henceforth DCT) builds on the work of Marantz (1991), McFadden (2004), Baker & Vinokurova (2010), and Baker (2012; 2015), among others, and adopts similar ideas from Yip et al. (1987), Bittner & Hale (1996), Kiparsky (1992; 2001), Wunderlich (1997), and Stiebels (2002). Case assignment in DCT relies primarily on Marantz's (1991: 24) disjunctive case hierarchy, which distinguishes between the following types of case:

(1) Lexically governed case ≻ Dependent case (accusative and ergative) ≻ Unmarked case (nominative and absolutive) ≻ Default case

There are several steps in the case-assigning process. In Step 1, all DPs selected by lexical items (verbs, prepositions, etc.) that idiosyncratically assign a particular case receive the lexically governed case value from the designated head upon c-selection. In Step 2, pairs of remaining caseless DPs are inspected in their local domains. Dependent case is assigned to them according to (a variation of) the following case assignment rules:

(2) a. If there are two distinct DPs in the same spell-out domain such that DP<sup>1</sup> c-commands DP<sup>2</sup>, then value the case feature of DP<sup>2</sup> as accusative unless DP<sup>1</sup> has already been marked for case (3).

	- b. If there are two distinct DPs in the same spell-out domain such that DP<sup>1</sup> c-commands DP<sup>2</sup>, then value the case feature of DP<sup>1</sup> as ergative unless DP<sup>2</sup> has already been marked for case (4).

These rules lead to a four-way typology of case alignments (Levin & Preminger 2015): The application of only the rule (2a) will lead to nominative-accusative alignment (3), while (2b) will yield ergative-absolutive alignment (4). If both parameters are simultaneously present in the same language, this would yield tripartite case systems (e.g. Nez Perce, where accusative and ergative can co-occur, see Baker 2015) and if both parameters are switched off, the language has neither ergative nor accusative case marking.


In Step 3, the remaining DPs that have not received case by means of competition with another DP, receive the unmarked case, which depends on the local domain in which the NP is found (nominative/absolutive in TP/CP, genitive in DP). Finally, default case is assigned to fragment answers and free-standing DPs (*Who bought the bread? Him./\*He.*).
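The stepwise procedure described above can be illustrated with a small program. This is only a sketch under simplifying assumptions and not part of DCT itself: the names `DP`, `assign_case`, `acc_rule`, and `erg_rule` are hypothetical, a spell-out domain is reduced to a list of DPs ordered by c-command, the set of case competitors is fixed once after Step 1, and default case is not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DP:
    name: str
    lexical_case: Optional[str] = None  # idiosyncratic case required by the selecting head
    case: Optional[str] = None          # filled in by assign_case


def assign_case(dps: List[DP], acc_rule: bool = True, erg_rule: bool = False,
                unmarked: str = "nom") -> dict:
    """Assign case to the DPs of one spell-out domain.

    `dps` is ordered from highest to lowest: each DP c-commands every DP
    to its right (a simplifying assumption of this sketch).
    """
    # Step 1: lexically governed case.
    for dp in dps:
        if dp.lexical_case is not None:
            dp.case = dp.lexical_case

    # Step 2: dependent case among the DPs still caseless after Step 1.
    competitors = [dp for dp in dps if dp.case is None]
    for i, higher in enumerate(competitors):
        for lower in competitors[i + 1:]:
            if acc_rule and lower.case is None:    # rule (2a)
                lower.case = "acc"
            if erg_rule and higher.case is None:   # rule (2b)
                higher.case = "erg"

    # Step 3: unmarked case for every DP that is still caseless.
    for dp in dps:
        if dp.case is None:
            dp.case = unmarked

    return {dp.name: dp.case for dp in dps}


# Nominative-accusative setting: only rule (2a) is active.
print(assign_case([DP("subject"), DP("object")]))
# Ergative-absolutive setting: only rule (2b) is active.
print(assign_case([DP("subject"), DP("object")],
                  acc_rule=False, erg_rule=True, unmarked="abs"))
```

Switching both rules on gives the tripartite pattern (ergative subject, accusative object), and switching both off leaves every DP with unmarked case, which corresponds to the four-way typology mentioned above.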

One evident problem for DCT is that dat can be assigned either in Step 1, as lexically governed case, or in Step 2, as dependent case. If assigned as dependent case, dat is taken to be assigned to a higher DP in the VP (Baker & Vinokurova 2010; Baker 2015). The case feature on a dative DP is thus sometimes supplied by a lexical head and sometimes assigned in a particular configuration in the VP. Even though the feature has two completely different sources in the syntax, the morphology still recognises it and realises it as the same exponent. We propose instead that assignment of dative via a lexical head can be abandoned in DCT: dat can always be treated as dependent case assigned to a higher DP in a VP. In line with proposals by Bittner & Hale (1996) and Baker (2015) (for case assignment in general), Wood (2017) (for lexical accusative case in Icelandic), and Baker & Bobaljik (2017) (for inherent ergative case), we do not assume that a verb comes with a lexical [∗dat∗] case feature (5). Rather, we propose that the verb comes with a covert pseudo co-argument DP, which enables the assignment of lexical dative as dependent case to a higher DP in a VP (6).

Furthermore, there is an ongoing debate within DCT on the timing of case assignment. While some authors take case assignment to be a syntactic process (see Preminger 2014 and Baker 2015; the latter times case assignment at Spell-Out, during linearization), others argue that dependent case is assigned at PF (Marantz 1991; McFadden 2004; Bobaljik 2008). In what follows, we take the syntactic side of the debate and offer a derivational implementation via the operation Agree between two DPs, which derives dependent case assignment as a narrow syntactic process and explains the dative puzzle outlined above.


### **2 Structural dative in Serbian**

In order to derive the assignment of structural dative case in double object constructions, this section offers a short empirical introduction on the structural relations between nom, acc and dat arguments in BCS. The order of the indirect object (IO) and the direct object (DO) is mostly free in Serbian and both orders can be used in neutral contexts:


However, there is reason to believe that IO ≻ DO, i.e. (7a) is the base order of the two objects, while (7b) is derived by A-movement. The evidence from quantifier scope (Aoun & Li 1989; Frey 1989; Bruening 2001) shows that, while in the V ≻ dat ≻ acc order only the reading where the quantifier in the IO scopes over the one in the DO is available (8a), the order V ≻ acc ≻ dat allows for both readings (8b). The availability of the reading where the existential quantifier outscopes the universal one in (8b) indicates that the DO can reconstruct in its base position, below the IO.


Furthermore, maximal focus projection (from a focused NP to the entire clause) is possible only if we maintain the base word order (Höhle 2018; von Stechow & Uhmann 1986; Haider 1992). A sentence in which movement has occurred should not be a good answer to the question *What happened?/What's new?*.<sup>1</sup> With the focus on the DO, if the whole sentence is new information, focus is perceived as neutral if the sentence has the canonical word order (9a). However, the focus in (9b) is not necessarily new information focus, as it does not project to the entire clause; it can be interpreted as contrastive, which indicates that the order is not the base one and movement has taken place.<sup>2</sup>

<sup>1</sup>Stjepanović (1999: 76) offers a similar argument for Serbo-Croatian.

(9) a. [Slavica Slavica je is poslala sent Marku Marko.dat pismo letter.acc ] 'Slavica sent a letter to Marko.'

	- b. # [Slavica Slavica je is poslala sent pismo letter.acc Marku Marko.dat ] 'Slavica sent a letter to Marko.' / 'It was Marko who Slavica sent a letter to.'

Finally, the order of object clitics in Serbian is always dat ≻ acc, regardless of the order of the IO and DO noun phrases. Stjepanović (1999) and Bošković (2001) assume that clitics move out of their VP into Agr projections. The strict hierarchy between them suggests that this movement respects superiority.

	- b. Ti you si are **mu** him.dat **ga** it.acc poslala. sent 'You sent it to him.'
	- c. \* Ti you si are **ga** it.acc **mu** him.dat poslala. sent 'You sent it to him.'
	- b. \* Ti you si are **ga** it.acc **mu** him.dat poslala. sent 'You sent it to him.'

<sup>2</sup>Even though the word order in (9b) is neutral, as noted in (7b), if the dat argument is focused, the sentence sounds less neutral than its counterpart in (9a). We thank an anonymous reviewer for this insight. Moreover, factors such as animacy and givenness may contribute to enabling other orders in neutral contexts; see recent findings by Titov (2017) for Russian and Velnić (2017) for Croatian.


	- c. Ti you si are **mu** him.dat **ga** it.acc poslala. sent 'You sent it to him.'

We conclude from these tests that the base word order of objects in Serbian is IO ≻ DO.

### **3 A derivational account of dependent case assignment**

Following Baker & Vinokurova (2010); Baker (2015); Preminger (2014); Levin & Preminger (2015), we assume that case is assigned in narrow syntax. We adopt case feature notations from Lexical Decomposition Grammar, following Kiparsky (1992; 2001); Joppen & Wunderlich (1995); Wunderlich (1997); Stiebels (2002):

(12) a. acc: [+hr] 'there is a higher role'

	- b. dat: [+hr +lr] 'there is a higher role and there is a lower role'
	- c. erg: [+lr] 'there is a lower role'
	- d. nom/abs: [ ] no case features

The features [+hr] and [+lr] are assigned in the course of the derivation to argument DPs via the operation Agree. We assume that both standard 'downward' Agree and 'upward' Agree (see Chomsky 1986; 1991; Kayne 1989; Pollock 1989; Koopman 2006) are possible options in the grammar (see also Abels 2012: 92f. as well as Baker's 2008: 155 *Direction of Agreement Parameter*). We propose that Agree applies between two DPs in a c-command relationship. When Downward Agree (↓Agr↓) applies, the higher of the two DPs in an asymmetric c-command relation probes down and receives the [+lr] feature from the lower one (see (13) below); when Upward Agree (↑Agr↑) applies, the lower DP probes upward and receives its [+hr] case feature from the higher DP (see (14) below). An important principle is that case valuation cannot take place if the *goal DP* already has a valued case feature (Bittner & Hale 1996; Baker 2015). One DP can participate in multiple Agree operations as a *probe* and, in principle, this can result in a DP receiving more than one case feature, as demonstrated in (15) below. Moreover, in a nom/acc system, ↓Agr↓ always precedes ↑Agr↑. In addition, in a nominative-accusative alignment, assignment of [+lr] in Spec*v*P must be pre-empted; otherwise the DP would receive ergative case. We assume that languages with nominative-accusative alignment have an *ergative switch-off parameter*, regulated by the following principle: In a nom-acc language, the higher DP in a *v*P cannot be case-valued.<sup>3</sup> Finally, we assume that the domain in which the proposed operations apply is the TP.

Let us apply the system to dative case assignment. In a double-object construction, a verb selects two objects, yielding thereby a VP with two unmarked DPs in a c-command relationship. Since this is a nom-acc system, ↓Agr↓ will always precede ↑Agr↑. Thus when ↓Agr↓ applies, the higher of the two DPs receives a [+lr] feature from the lower one. Consequently, ↑Agr↑ does not apply because the potential goal is already case-valued.

### (13) **Assignment of [+lr] in VP**

After the external DP<sup>3</sup> is introduced in Spec*v*P, we now have three DPs in the same domain. The remaining two caseless DPs are DP<sup>1</sup> and DP<sup>3</sup>. When ↓Agr↓ applies between the highest DP<sup>3</sup> in Spec*v*P and the lowest DP<sup>1</sup>, no case valuation obtains, due to the ergative switch-off parameter, which demands that a DP in Spec*v*P cannot be case-valued. ↑Agr↑ thus applies afterwards, whereby the lower DP receives the [+hr] feature from the higher one (14).

### (14) **Assignment of [+hr] to the lower argument in VP**

<sup>3</sup>Alternatively, assuming that at the *v*P level ↑Agr↑ precedes ↓Agr↓ yields the same results.

However, DP<sup>2</sup> and DP<sup>3</sup> still fulfil the criteria for case assignment to apply, since they are in a c-command relationship, and the higher one is not marked for case (15). Thus ↑Agr↑ applies, providing the lower DP<sup>2</sup> with a [+hr] feature (and the [[+hr], [+lr]] bundle is realised as dative).

### (15) **Assignment of [+hr] to higher argument in VP**

This implementation derives the assignment of dependent case by means of existing, independently motivated mechanisms, in a derivational manner. An interesting prediction is that at the point in the derivation before the external argument is merged, dative should behave in a similar way to ergative case, as it only bears a [+lr] feature, as in (13). While we leave this point for further research, note that similarities between datives and ergatives have been reported in Basque by Arregi & Nevins (2012), in Indo-Aryan languages by Butt (2006) and even in Serbo-Croatian by Progovac (2013). Another important prediction is that movement of the DO should not affect acc case assignment, since the configuration required for assigning the [+hr] feature still holds after movement, as shown by (16). In this derivation, DP<sup>2</sup> is first assigned the [+lr] feature by ↓Agr↓ with DP<sup>1</sup>, which then moves while still caseless. After DP<sup>3</sup> has been introduced, both DP<sup>1</sup> and DP<sup>2</sup> receive their missing [+hr] features by ↑Agr↑ with it.
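The derivation of a double-object construction in (13)–(15) can be traced with a short program. The following is a minimal sketch under our reading of the assumptions stated in this section, not a definitive implementation: the names `DP`, `downward_agree`, `upward_agree`, and `merge` are hypothetical, the features [+hr] and [+lr] are represented as the strings `"hr"` and `"lr"`, and the ergative switch-off parameter is modelled as a flag on the DP merged in Spec*v*P.

```python
from dataclasses import dataclass, field

# Feature bundles and their realisation, following (12).
REALISATION = {
    frozenset(): "nom",
    frozenset({"hr"}): "acc",
    frozenset({"lr"}): "erg",
    frozenset({"hr", "lr"}): "dat",
}


@dataclass
class DP:
    name: str
    features: set = field(default_factory=set)

    @property
    def case(self) -> str:
        return REALISATION[frozenset(self.features)]


def downward_agree(probe: DP, goal: DP, probe_in_spec_vp: bool) -> None:
    """The higher DP probes down and receives [+lr] from the lower DP."""
    if goal.features:        # the goal is already case-valued: no valuation
        return
    if probe_in_spec_vp:     # ergative switch-off in a nom-acc language
        return
    probe.features.add("lr")


def upward_agree(probe: DP, goal: DP) -> None:
    """The lower DP probes up and receives [+hr] from the higher DP."""
    if goal.features:        # the goal is already case-valued: no valuation
        return
    probe.features.add("hr")


def merge(structure: list, new_dp: DP, in_spec_vp: bool = False) -> None:
    """Merge new_dp above all DPs in structure and run Agree with each of them."""
    for lower in structure:
        downward_agree(new_dp, lower, in_spec_vp)  # downward Agree comes first
        upward_agree(lower, new_dp)
    structure.append(new_dp)


# 'Slavica je poslala Marku pismo' (9a): DO, then IO, then the external argument.
do, io, subj = DP("pismo"), DP("Marku"), DP("Slavica")
structure = []
merge(structure, do)
merge(structure, io)                     # VP level: the IO receives [+lr], as in (13)
merge(structure, subj, in_spec_vp=True)  # vP level: DO and IO receive [+hr], (14)-(15)
print({dp.name: dp.case for dp in structure})
```

After the VP cycle the IO bears only [+lr], which is the stage at which the text predicts dative to pattern with ergative; the second [+hr] feature is added only once the external argument has been merged.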

### **4 Lexical dative**

### **4.1 Similarities between structural and lexical dative**

As noted in the introduction, the central claim of this paper is that lexical dative case is assigned just like the structural dative. In order to support this claim, we first demonstrate that there are indeed similarities between 'structural' and 'lexical' datives in their syntactic behaviour.

For instance, they act in a similar way in passivisation. In double-object constructions, only the accusative object can be passivised, i.e. only the theme argument can alternate between accusative and nominative, as in (17).

	- b. Knjiga book.nom.sg.f je is bila been.sg.f data given.sg.f Milošu. Miloš.dat 'The book was given to Miloš.'
	- c. Milošu Miloš.dat je is bila been.sg.f data given.sg.f knjiga. book.nom.sg.f 'The book was given to Miloš.'

The dative argument, however, cannot be turned into a subject and it never alternates (18).

(18) a. \* Miloš Miloš.nom je is bio been.sg.m dat given.sg.m knjigu. book.acc 'Miloš was given a book.'


	- b. \* Milošu Miloš.dat je is {bio been.sg.m / bilo} been.sg.n {dat given.sg.m / dato} given.sg.n knjiga. book.nom 'Miloš was given a book.'

Unlike in Icelandic (as described by Zaenen et al. 1985), the dative cannot bind a subject-oriented anaphor (19a) and it cannot be deleted under subject ellipsis (19b); hence it is not a subject.

	- b. \* Miloš Miloš.nom je is bio been.sg.m izbačen thrown.out.sg.m sa from časa class i and \_\_\_ bio been.sg.m je is dat given.sg.m ukor. reprimand intended: 'Miloš was thrown out of the class and he was reprimanded.'

Parallel to (17) above, some constructions with lexical datives can be passivised, as in (20), where the lexical dative in (20b) mirrors the structural one from (17c).

	- b. Ani Ana.dat je is bilo been.sg.n pomognuto. helped.sg.n 'Ana was helped.'

However, the subjecthood tests of Zaenen et al. (1985) also show that this dative does not behave like a subject. It does not bind a subject-oriented anaphor (21a) and it cannot be deleted under subject ellipsis (21b), just like the structural dative in (19).

	- b. \* Ana Ana.nom je is uradila done.sg.f sve all zadatke tasks.acc i and \_\_\_ pri with tome that je is bilo been.sg.n pomognuto. helped.sg.n 'Ana did all the tasks and was helped with that.'


Moreover, as argued by Maling (2001) and shown for German by McFadden (2004), one of the structural asymmetries between DOs and IOs is their behaviour in nominalisations. DOs appear in genitive when the VP is nominalised (22b), unlike both structural (22c) and lexical datives (23), which do not alternate with genitive.<sup>4</sup>

### (22) **Structural dative**

	- i. 'the giving of Miloš (to someone)'
	- ii. \* 'the giving (of something) to Miloš'

### (23) **Lexical dative**

	- i. 'the belonging of Ana (to someone)'
	- ii. \* 'the belonging to Ana'

<sup>4</sup>A reviewer wonders about the status of *darivanje Miloša* 'the giving of something to Miloš.gen' in (22c). We believe that here the genitive of the complement of *darivati* is lexical. We leave it to future research to explore how lexical genitive fits into the current proposal.


Finally, as argued for German by Sternefeld (1985); Bayer et al. (2001); McFadden (2004), in the so-called 'topic drop' constructions, it is possible to omit the acc (25a), but not a dat topic, irrespective of whether it is structural (25b) or lexical (25c).

	- b. Da, yes jednom once sam am \*(joj) her.dat poklonila gave cvet. flower 'Yes, I once gave her a flower.' (structural dat)
	- c. Da, yes jednom once sam am \*(joj) her.dat pomogla. helped 'Yes, I helped her once.' (lexical dat)

From these similarities, we conclude that lexical and structural datives can be treated as the same type of syntactic objects.<sup>5</sup> In the next sections, we will inspect different types of lexical datives we have identified in Serbian in turn.


<sup>5</sup>An additional language-specific test that points in the same direction is Left Branch Extraction, which is allowed out of subjects (i.a) and objects (i.b) in Serbian (see Bošković 2005, and subsequent work), but seems to be disallowed both with structural (i.c) and lexical dative (i.d). However, the acceptability of the examples varies across speakers, and it can also be influenced by factors such as word order. We leave this very interesting issue for future research.


### **4.2 Lexical dative as dependent case**

### **4.2.1** *Help***-type verbs as underlying ditransitives**

*Help*-type verbs include verbs such as *pomoći* 'help', *čestitati* 'congratulate', *ugoditi* 'please', *služiti* 'serve', *verovati* 'believe', *zavideti* 'envy', *doprineti* 'contribute', etc. (a partial list from several types of monotransitive constructions identified by Stipčević 2014). We argue that these verbs are underlyingly ditransitive: the DPacc is present but covert, and even as such it serves as a competitor for dative case assignment. In these constructions, the nom argument is usually an agent, while the dat argument can bear a beneficiary/maleficiary/recipient/goal/target person theta-role. The unmarked word order of arguments of *help*-type verbs is nom ≻ dat (26).

	- b. Trener coach.nom je is čestitao congratulated svojim poss.dat igračima. players.dat 'The coach congratulated his players.'

A possibly crucial piece of evidence for postulating a silent DPacc is that even though usually monotransitive, these constructions can have another *overt* acc argument:<sup>6</sup>

(27) a. Ljubica Ljubica.nom je is pomogla helped svom poss.dat detetu child.dat školovanje. education.acc 'Ljubica sponsored her child's education.'

	- b. Trener coach.nom je is čestitao congratulated svojim poss.dat igračima players.dat pobedu. victory.acc 'The coach congratulated his players on the victory.'

<sup>6</sup>Note a similar kind of behaviour of lexical datives in German, discussed by McFadden (2004: 129), who takes this as a piece of evidence that lexical dative assigned by *glauben*/*helfen*-type verbs in German can be analysed as structural dative.

(i) Er he.nom glaubt believes seinem poss.dat Bruder brother.dat die the Geschichte. story.acc 'He believes his brother's story.'

*Help*-type constructions with lexical datives in Serbian seem to be able to passivise (forming an impersonal passive construction; recall (20)). Such evidence suggests that constructions of this type can be treated as double-object constructions, equivalent to those in (7), allowing for treatment of lexical dative as structural.

We therefore argue that constructions with the *help*-type verbs are in fact double-object constructions. The lower acc object is present as a silent DP (see Wood 2017 for a similar proposal for lexical accusatives in Icelandic and Baker & Bobaljik 2017 for similar ideas for ergative case). This silent DP can sometimes be realised overtly, as in (27) above. The 'lexical' dative is assigned in the same manner as in ditransitive double-object constructions. The feature [+lr] is assigned to the higher DP at the VP level via ↓Agr↓. The assignment of [+hr] applies at *v*P, by ↑Agr↑, which is established with the nominative DP in Spec*v*P.

These constructions are therefore underlyingly true ditransitives, which explains their striking similarities to regular canonical ditransitive constructions and the similarities in the syntactic behaviour between the datives in the two.

### **4.2.2 An extension:** *Adjust***-type verbs as underlying ditransitives**

Another class of verbs identified by Stipčević (2014: 300f.) selects dative objects, where the dative argument mostly bears a target person/goal theta-role.


Some of the verbs include: *odužiti se* 'pay back', *osvetiti se* 'take revenge', *suprotstaviti se* 'confront', *predati se* 'give in/give up', *oteti se* 'escape', *priključiti se* 'join', *prilagoditi se* 'adjust', etc. Most of these verbs contain the morpheme *se*, which mostly has a reflexive interpretation. The nominative argument is usually an agent in these sentences and the unmarked order is nom ≻ dat (29).

	- b. Srdjan Srdjan.nom se refl predao surrendered.sg.m policiji. police.dat 'Srdjan surrendered to the police.'

Another overt acc argument can be added, but in that case the morpheme *se* cannot appear in the sentence. Comparing (29a)/(29b) with (30a)/(30b) respectively, we can see that *se* and acc seem to be in complementary distribution. *Se* therefore seems to absorb acc case (see also Franks 1995).<sup>7</sup>

	- b. Srdjan Srdjan.nom je is (\*se) refl predao submitted.sg.m dokumente documents.acc policiji. police.dat 'Srdjan submitted the documents to the police.'

<sup>7</sup>Passivisation is unfortunately inconclusive as a test. Sentences with an overt accusative can be passivized regularly (i.a), but the ones without the overt acc argument and with the *se* morpheme cannot be (i.b).

	- b. \* Situaciji situation.dat se refl / je is bilo been.sg.n prilagodjeno. adjusted.sg.n intended: 'One adjusted to the situation.'
	- c. Situaciji situation.dat se refl prilagodilo. adjusted.sg.n 'One adjusted to the situation.'

As (i.c) shows, the only possible 'passive' form with these constructions is actually an impersonal middle construction, which is expected if these constructions already involve argument reduction even in the active voice (see Progovac 2013; Marelj 2004).


The similarities between (30) and (29) above can be captured by the derivations in (31) and (32). While verbs with 'structural' dative contain an overt DP as a DO, *adjust*-type verbs contain a silent DP. Crucially, the [+lr] feature is assigned to the higher of the two DPs in the VP. While in (31) the lower DP receives the [+hr] feature and thereby acc case upon merging the external argument, in (32), the lower DP argument in the VP is reduced (or alternatively it starts out as a null DP) and becomes realised by *se*.


### **4.2.3** *Belong***-type verbs as unaccusative ditransitives**

*Belong*-type verbs include verbs such as *pripadati* 'belong', *zapasti* 'get into/end up with', *nedostajati* 'miss', etc. (see also Stipčević 2014). We argue that these verbs are underlyingly ditransitive as well, but they do not take an external argument and are, therefore, unaccusative. The nom argument is usually a theme, while dat is usually interpreted as possessor. The unmarked word order is nom ≻ dat, as illustrated by (33).

(33) Ova this.nom kapa cap.nom pripada belongs Ani. Ana.dat 'This cap belongs to Ana.'

No additional overt accusative arguments can be added to these verbs and a structure like this cannot be passivised (34). The impossibility of passivization, the lack of overt accusative argument and the theme interpretation of the nom argument suggest therefore that such constructions are essentially unaccusative. The idea that the nom argument is introduced as the internal argument of the verb, which is later moved to the sentence-initial position, can be supported by evidence from quantifier scope. In (35), the possibility for the existential quantifier to outscope the universal one indicates that the nom argument has been moved and is able to reconstruct in its base position.<sup>8</sup>


In order to derive this type of lexical dative as dependent case, we assume that the two internal arguments of these verbs are both merged as the arguments of V, as in (36). In this configuration, ↓Agr↓ applies first and the higher DP receives the [+lr] feature from the lower one. The lower DP does not receive any case features at the VP level. Since these verbs are unaccusative, no external argument is merged in Spec*v*P. However, the theme argument must move up in order to become the (derived) subject of the sentence. In order to move to SpecTP, it has to move through the *v*P phase edge (Legate 2003). At the *v*P edge, this DP can now serve as a case competitor again. After ↓Agr↓ fails due to the ergative switch-off parameter that precludes case valuation in Spec*v*P, ↑Agr↑ succeeds, and [+hr] is assigned to the dat DP (37).

<sup>8</sup>This situation mirrors the one in (8b). Note that since Serbian is a rigid scope language, only movement can affect quantifier scope, thus the reading here cannot be derived by quantifier raising of the existential quantifier and must instead involve movement (see Antonyuk 2015).

In conclusion, treating these constructions as unaccusatives correctly captures the fact that they cannot passivize and that the DPnom is interpreted as a theme rather than agent, thereby enabling a unified treatment of lexical and structural dative as dependent case.
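The two lexical-dative configurations discussed so far, a silent DO with *help*-type verbs (Section 4.2.1) and movement of the theme to the *v*P edge with *belong*-type verbs, can be compared side by side in a small sketch. The code below is illustrative only: the names `DP`, `agree`, `help_type`, and `belong_type` are hypothetical, the label `"silent DO"` stands in for the covert co-argument, and each call to `agree` applies ↓Agr↓ and then ↑Agr↑ to one pair of DPs, with the ergative switch-off modelled as a flag.

```python
from dataclasses import dataclass, field

# Realisation of feature bundles, following (12).
CASES = {
    frozenset(): "nom",
    frozenset({"hr"}): "acc",
    frozenset({"lr"}): "erg",
    frozenset({"hr", "lr"}): "dat",
}


@dataclass
class DP:
    name: str
    feats: set = field(default_factory=set)

    def case(self) -> str:
        return CASES[frozenset(self.feats)]


def agree(higher: DP, lower: DP, higher_in_spec_vp: bool) -> None:
    """Apply downward Agree, then upward Agree, to one c-commanding pair."""
    # Downward Agree: blocked by a case-valued goal or by the ergative switch-off.
    if not lower.feats and not higher_in_spec_vp:
        higher.feats.add("lr")
    # Upward Agree: blocked if the higher DP (the goal) is already case-valued.
    if not higher.feats:
        lower.feats.add("hr")


def help_type() -> dict:
    """'Trener je čestitao svojim igračima' (26b), with a silent DO."""
    silent, dative, agent = DP("silent DO"), DP("igračima"), DP("trener")
    agree(dative, silent, higher_in_spec_vp=False)  # VP level
    agree(agent, silent, higher_in_spec_vp=True)    # vP level
    agree(agent, dative, higher_in_spec_vp=True)
    return {dp.name: dp.case() for dp in (agent, dative, silent)}


def belong_type() -> dict:
    """'Ova kapa pripada Ani' (33): no external argument, theme moves to the vP edge."""
    theme, possessor = DP("kapa"), DP("Ani")
    agree(possessor, theme, higher_in_spec_vp=False)  # VP level: possessor is higher
    agree(theme, possessor, higher_in_spec_vp=True)   # theme at the vP edge
    return {dp.name: dp.case() for dp in (theme, possessor)}


print(help_type())
print(belong_type())
```

In both functions the dative DP collects [+lr] inside the VP and [+hr] at the *v*P level; the configurations differ only in which DP supplies the second feature, the external argument in the first case and the moved theme in the second.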

### **4.3 An extension: (***feel***)-***like***-type verbs as unaccusative ditransitives**

(*Feel*)-*like*-type verbs select for an experiencer-type dative argument, as in (38). The unmarked word order seems to be dat ≻ nom.


(38) Ani Ana.dat se *se* svidja appeals zelena green.nom haljina. dress.nom 'Ana likes the green dress.'

As with the previous group, no additional overt accusative arguments can be added to this structure. Moreover, a structure like this cannot be passivised (39).

(39) \* Ani Ana.dat je is bilo been svidjano. appealed.sg.n 'It was appealed to Ana.'

The impossibility of passivization and the lack of an overt accusative argument, together with the theme interpretation of the nom argument, suggest that this could be an unaccusative construction. The *se* clitic, however, does not have a reflexive interpretation; following Progovac (2013), it can be assumed to be an expletive object pronoun. Based on the fact that these verbs cannot assign accusative and that the DPnom is ambiguous between subject and object interpretation, Progovac (2013) argues that structures like these are in fact instances of an ergative-absolutive pattern in a language like Serbian. Such sentences would be analysed as in (36) and (37) above. The [+lr] feature is assigned to the higher DP at the VP level via ↓Agr↓, while the [+hr] feature is assigned at the *v*P level via ↑Agr↑. We leave the exact nature of the clitic *se* in these constructions for future research, which should be able to tell whether it is an additional silent argument that absorbs certain case features, or whether it is an expletive.

### **5 Conclusion**

Dependent case assignment can be formalised by means of a derivational approach, where case features are assigned incrementally, via an Agree operation which holds between two DPs. Dative is assigned as the high dependent case in the VP, while accusative is the low dependent case in the *v*P. We have seen evidence from Serbian that the account of structural dat can be extended to cover the assignment of lexical dat. Lexical dative is thus assigned in the same configurations: (i) in a ditransitive double-object construction with a silent DP as DO and a case competitor, (ii) in a double-object construction involving an unaccusative verb. In its strictest form, therefore, Dependent Case Theory can capture the assignment of both lexical and structural dative case as dependent case.


### **Abbreviations**


### **Acknowledgements**

For their helpful comments, suggestions and feedback, we would like to thank two anonymous reviewers, the editors, as well as the audiences at FDSL 12 and at the University of Leipzig. This work was completed as part of the DFG-funded graduate school Interaktion Grammatischer Bausteine 'Interaction of Grammatical Building Blocks' (IGRA).

### **References**

Abels, Klaus. 2012. *Phases: An essay on cyclicity in syntax*. Berlin: de Gruyter.



*guistics 21: The Third Indiana Meeting 2012*, 246–259. Ann Arbor, MI: Michigan Slavic Publications.


### **Chapter 17**

## **Transitivity Requirement revisited: Evidence from first language acquisition**

### Teodora Radeva-Bork

University of Potsdam

The paper investigates null objects in early child grammar in light of the Transitivity Requirement approach (Cummins & Roberge 2005), which states that transitivity is not dependent on the lexical features of the verb but is a universal grammatical property. I review naturalistic and experimental child data from sixteen typologically different languages (including five Slavic representatives) and show that the predictions of the Transitivity Requirement approach are not borne out. Instead, the results suggest that early object omissions reflect the presence of (optional) object drop in the target grammar. Children seem to omit objects only if the target grammar allows for this option, as is the case, for example, in Russian, Ukrainian and Polish.

**Keywords:** null objects, child grammar, Transitivity Requirement, Slavic, crosslinguistic data

### **1 Introduction and preliminaries**

While the study of null subjects in Slavic has received much attention (Franks 1995, Lindseth 1998, Fehrmann & Junghanns 2008, Müller 2006, among others), null / missing / implicit direct objects still constitute an under-researched area and the distribution of object drop is still not uniformly capturable. Object drop has not been used extensively as a way to classify languages in a typology. In other words, whereas it is common to talk about pro-drop or null subject languages, references to "object drop languages" or "null object languages" are much less frequent in the literature. One important reason for this classificatory asymmetry is that object drop appears to be much more variable than subject drop. Most attempts to identify a common denominator for null objects have failed in

Teodora Radeva-Bork. 2018. Transitivity Requirement revisited: Evidence from first language acquisition. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 381–400. Berlin: Language Science Press. DOI:10.5281/zenodo.2545539


cross-linguistic terms. Possible restrictions on object drop have been discussed previously, such as, for instance, overt morphological verb-object agreement, which holds for Swahili or Georgian but not for Russian or Chinese; topic drop, holding for German but not for other null-object languages; as well as other conditions like specific structural contexts favouring the appearance of null objects (e.g. sequence of verbs or imperatives). Generally, it is assumed that null objects are a licit option in the grammars of Russian, Polish, to some extent German, European and Brazilian Portuguese, and Chinese, among other languages. Languages such as Bulgarian, Serbo-Croatian or Spanish, on the other hand, disallow null objects.

In this paper, I examine the omission of referential, definite objects as in (1), a type that happens to be ungrammatical in English (1a), but grammatical in other languages, such as Russian (1b). What I leave aside are non-referential null objects, illustrated by (2). For a discussion of the licensing of object drop of indefinite DPs in European Spanish, Modern Greek, and Bulgarian see Campos (1986), Giannakidou & Merchant (1997), and Dimitriadis (1994). See also Dvořák (2017) for a recent in-depth discussion of indefinite and generic null objects in Czech. For the sake of terminological clarity, I use null objects to refer to the phonological non-realisation of direct objects in transitive contexts, as in (1). (Other common terms include "object omission" or "object drop".)

### (1) **Referential/Definite null object**

	- a. B: \* I read Ø. (English)
	- b. B: Polivaju Ø. / Ja polivaju ego. (Russian)
	  water.1sg / I water.1sg it
	  'I'm watering it.'

### (2) **Non-referential/Indefinite null object**


Object realization or omission has both a syntactic component (what kinds of mechanisms govern the licensing and recoverability of null objects) and a lexical component (what types of verbs allow optional realization of their direct object argument). In this paper, I concentrate on a syntactic approach to transitivity, based on the so-called Transitivity Requirement (TR) proposed by Cummins & Roberge (2005). In parallel to the Extended Projection Principle (EPP) for subjects, it suggests that the direct-object position is given by Universal Grammar and is not dependent on the lexical features of the verb. The syntactic analysis of null objects is particularly appealing as it provides very concrete and testable predictions about transitivity development in first language acquisition. Under the TR, null objects are predicted to be a part of the default initial setting for acquisition purposes. This view is advocated in Pérez-Leroux et al. (2008), who suggest that children start out with a null cognate object default, and that the initial referential properties of this null cognate object are broader than in the target grammar.<sup>1</sup> Experience serves to block, or narrow down, the referential semantics of the null default. It follows from this that we should be able to find evidence of object omissions in the early stages of language development in typologically different languages, irrespective of the availability of null objects in the target grammar.

The main aim of this paper is to evaluate the empirical validity of the TR by examining acquisition data from sixteen typologically different languages, including five Slavic representatives; see Table 1 and Table 2. I review the results from studies carried out on these languages with the aim of examining object (non)omission in the early stages of grammar, especially in light of the TR. Such a secondary approach to primary data is justified: as more research on a given topic within a particular language family emerges, it is valuable to have research that consolidates the studies and elucidates similarities and differences across language families. The Slavic perspective is particularly interesting since Slavic languages vary with respect to the availability of object drop although they share a number of common morphosyntactic features. Additionally, language acquisition in Slavic is still under-researched compared to other languages, and this paper aims to contribute to the cross-linguistic investigation of the early development of objects by presenting and reviewing child data from Slavic.

The paper is organised as follows. §2 sketches some theoretical approaches to object omission, focusing on the discussion of the syntactic transitivity approach

<sup>1</sup>Since different languages have different conditions as to where objects are allowed to remain unpronounced, Pérez-Leroux et al.'s (2008) approach is to seek a common denominator in null object constructions. On the basis of French and English data, they identify a null bare N object as the common denominator in English and French. The authors suggest that by postulating this common denominator as the minimal default, they can make inferences about what development is required to attain the proper distribution and features of null objects in a given target.


by Cummins & Roberge (2005) and outlining the predictions of this analysis for the acquisition of objects, with respect to the object omissions children are predicted to show. In §3, I discuss experimental and naturalistic child data from Russian, Serbo-Croatian, Bulgarian, Polish, Ukrainian, French, English, Spanish, Catalan, Italian, European Portuguese, Brazilian Portuguese, Romanian, Standard Modern Greek, Cypriot Greek, and Chinese. The participants in the studies are typically-developing monolingual children with the core age 2–4 years, as well as 4–6 years for some languages (for the detailed data description and methodology, see §3.1). The survey of the data shows that the predictions made by the TR are not borne out, and null objects are not a default setting in the early stages of grammar. Based on the empirical findings, I suggest that there is a strong link between children's object omissions and the grammaticality of null objects in the target grammar. This view is compatible with the proposal made in Varlokosta et al. (2016), suggesting that children generally opt for the weakest alternative on the scale pronoun > clitic > null, depending on what is available in their language. Of course, this proposal needs further investigation in studies that test *different* types of objects, i.e. full pronouns, clitics and full DPs.

### **2 The Transitivity Requirement and its predictions for child grammar**

To start off, I briefly sketch the lexically and syntactically motivated approaches to argument structure, with special emphasis on the syntactic transitivity approach by Cummins & Roberge (2005) and its prediction for the development of (direct) objects in the early stages of grammar.

Verbs are flexible as to which and how many argument positions they project (van Hout 2012: 25). According to the lexical approach, the verb's flexibility is incorporated into its lexical representation, i.e. the verb is lexically represented with more than one representation, each of which is linked to a certain verb subcategorization frame (Chomsky 1965; Emonds 1991). In the case of English generic null objects, Rizzi (1986) assumes that theta roles can be fully saturated in the lexicon. Other, discourse-motivated approaches, such as Groefsema (1995) and Fellbaum & Kegl (1989), associate the use of certain null objects such as generic (non-referential) null objects, cf. (2), with discourse factors and pragmatic considerations.

An alternative analysis is provided by the modular account relying on a strictly syntactic approach to the occurrence of null objects. The Transitivity Requirement (TR) by Cummins & Roberge (2005), parallel to the Extended Projection Principle (EPP) for subjects, suggests that the direct-object position is given by Universal Grammar and is not dependent on the lexical features of the verb. Thus, the direct-object position is not seen as a characteristic depending on the lexical-semantic features of the verb, but rather as an integral, essential element of the predicate. Under the TR, transitivity is viewed as a universal grammatical property. Null objects are structurally present, and all VPs (i.e. with transitive, unergative, unaccusative verbs, etc.) contain an object position that can be overtly expressed or not (Cummins & Roberge 2005). When an object is not phonologically realized, it remains as a null object in the VP.

Under the TR, (3) is considered to be a universal structural template for objects (Cummins & Roberge 2005; Pérez-Leroux et al. 2008). This template is shown in the tree below, where N is an implicit null object.

The main premises of the TR-based approach, namely that (i) transitivity is a universal grammatical property and (ii) null objects are by default structurally represented in all languages, provide a fruitful ground for making precise predictions about the initial states of human grammar. If null objects are present by default, we should expect children to go through a stage of object optionality (cf. Pérez-Leroux et al. 2008), irrespective of the object-drop capacity of the specific target grammars. An overgeneralization of the free availability of null objects due to a failure to restrict the null structure to the appropriate context is predicted. Omissions should therefore be found in typologically different languages, irrespective of the availability of null objects in the target grammar and without reference to the pronominal system of the specific language. For example, objects are expected to be dropped in the early development of languages with and without clitic systems (such as Bulgarian and English, for example). Such a prediction is particularly challenging since clitic pronouns are generally prohibited from dropping. The emerging research question, namely whether children of all languages go through a null object stage, is addressed in the next section, presenting empirical data from sixteen languages.


### **3 Null objects in child grammar**

In order to test the validity of the predictions made by the TR, I turn to the examination of how children acquiring various languages deal with direct objects in the acquisition process. The comparison of developmental patterns in typologically different languages such as Russian, Greek, French, and Chinese, to name only a few, allows us to formulate hypotheses about universally represented structures at the starting point of linguistic development and about grammatical elements that are specific to particular languages. More importantly, a comprehensive survey of studies conducted on the acquisition of objects in different languages, which summarizes and compares the derived results, can test the prediction made by the TR that children of *all* languages go through a null object stage.

### **3.1 Data**

I review data from experimental studies (see Tables 1 and 2), concerned with both the production and comprehension of direct objects in elicited and naturalistic environments. The data stem from studies on sixteen typologically different languages. The focus of the present paper is on the five Slavic representatives: Russian, Serbo-Croatian, Bulgarian, Polish and Ukrainian, but the data are also


Table 1: Reviewed studies on the acquisition of objects (Slavic languages)

Table 2: Reviewed studies on the acquisition of objects (non-Slavic languages)

placed in a cross-linguistic context by comparing the five Slavic languages to eleven other languages for which object drop has been studied, namely French, English, Spanish, Catalan, Italian, European Portuguese, Brazilian Portuguese, Romanian, Standard Modern Greek, Cypriot Greek, and Chinese. Tables 1 and 2 give an overview of the languages and the conducted studies, including information about the type of data, i.e. elicited and/or spontaneous, as well as about the ages of the children tested.<sup>2</sup> For French, English, Spanish, and Italian, there is a greater number of studies than for other languages, so only a selection of the most recent and representative studies could be included here.

The overview of studies shows that the acquisition of objects has been well examined over the last three decades, with studies covering a vast number of languages and providing both spontaneous and elicited child data from production and comprehension, something which is rather rare in the assessment of acquisition of other grammatical phenomena. This is particularly beneficial for the present goals, since the TR-based approach predicts object drop in the early stages of language acquisition irrespective of typological differences found in individual language systems.

Here, I analyse production and comprehension data from Polish, French, English, and Spanish. For Russian, Serbo-Croatian, Bulgarian, Ukrainian, Catalan, Italian, European Portuguese, Brazilian Portuguese, Romanian, Standard Modern Greek, Cypriot Greek, and Chinese, I deal with production data in elicited and spontaneous contexts. The core age of the participants in the studies lies between two and four years, with some languages (Russian, Polish, French, English, and European Portuguese) including older children, four to six years old, in some of the studies. In the majority of the studies participants are controlled for gender. The subjects are typically-developing, monolingual children, recruited from day cares or schools.
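Ages in the reviewed studies are written in the years;months notation (e.g. 1;9 or 2;3). As a purely illustrative aid, the following minimal sketch converts that notation to months so that age ranges from different studies can be compared; the function name `age_to_months` is hypothetical and not taken from any of the studies.

```python
def age_to_months(age: str) -> int:
    """Convert an age in 'years;months' notation (e.g. '1;9') to total months."""
    years_text, months_text = age.split(";")
    years, months = int(years_text), int(months_text)
    # The months part counts months past the last birthday, so it must be 0-11.
    if years < 0 or not 0 <= months < 12:
        raise ValueError(f"not a valid years;months age: {age!r}")
    return years * 12 + months


# '1;9' is 1 year and 9 months, i.e. 21 months in total.
print(age_to_months("1;9"))
# The core age range of two to four years corresponds to 24-48 months.
print(age_to_months("2;0"), age_to_months("4;0"))
```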

The comparison of results from the included studies is legitimate due to the use of a uniform and highly comparable experimental methodology, which is described in the next paragraph. In fact, in a recent analysis of meta-megastudies, Myers (2016) shows that methodological differences across studies seem generally insufficient to explain large differences in results, and that what seems to have a bigger effect are typological differences between languages. Whereas a detailed discussion of methodological effects in object elicitation tasks is beyond the scope of this paper, I hold that it is legitimate to compare the results from the

<sup>2</sup>Ages are given in years and months, i.e. 1;9 indicates 1 year and 9 months of age.

presently included studies mainly due to the use of a common elicitation procedure. However, see Varlokosta et al. (2016), who argue for an effect of the elicitation methodology used on the production of clitic objects in experimental tasks.

Studies on the acquisition of objects employ a standard elicited production task (Schaeffer 2010; Pérez-Leroux et al. 2008; Radeva-Bork 2012; among others) to examine how children use direct objects in transitive contexts of the kind found in (4), where (4a) is a licit option in the adult grammar of some languages, such as Russian or Polish, but not in others, such as Bulgarian or Serbo-Croatian. Examples (4b) and (4c) represent the grammatical choices for Bulgarian, making use of a full NP/pronoun or a clitic, respectively.

(4)

	- a. \* Toj ritna Ø. (Bulgarian)
	  he kicked
	  Intended: 'He kicked it.'
	- b. Toj ritna topkata / neja.
	  he kicked ball.f.def / her.acc
	  'He kicked the ball.' / 'He kicked it.'
	- c. Toj ja ritna.
	  he it.cl kicked
	  'He kicked it.'

In such elicitation tasks, participants are shown simple act-outs with toys and props, or picture cards illustrating simple activities, such as kicking a ball, drawing a flower, or building a house. Every activity represents a transitive scenario with a subject and an object. The studies involve a large number of test items, usually between six and twelve. After the visual prompt, participants hear a control question of the kind *What did X do?* without the target object being mentioned. Depending on the specificities of the language, target answers contain a transitive structure with an overt object or with its omission, cf. (4). Transitive verbs such as *kick*, *draw*, *build*, *give*, *hug*, *drink*, *hit*, *push* etc. are elicited in the tasks. A screening prior to the study guarantees that the children understand the object nouns and the verbs denoting the actions in the tasks. An example of a model elicitation of a direct object is given in (5). The use of an overt object is obligatory here. Similar tasks have been used in the elicitation studies presented in Tables 1 and 2. For the spontaneous data, recordings and transcripts are used.


### (5) **Model elicitation of direct objects in Bulgarian**

Experimenter 1:

'This is Maria. This here is her favourite doll. The doll's hair is so bushy.' (utterance accompanied by an act-out of the experimenter combing the doll)

Experimenter 2:

Kakvo napravi Maria?
what did Maria
'What did Maria do?'

Child 2;6:

Sresa kuklata.
combed doll.def
'She combed the doll.' (adapted from Radeva-Bork 2012: 79)

### **3.2 Results**

An analysis of the obtained results shows that there is a high degree of variation across languages when it comes to object omission in early grammars. Since it is impossible to give a detailed presentation of the results from the individual studies in this paper, I focus on the Slavic data (marked in bold in Table 3), and present the results from the other languages for the sake of cross-linguistic comparison.

Table 3: General results for the spread of object (non)omission.


Generally, we find evidence of object omission in Russian, Ukrainian, Polish, European Portuguese, Brazilian Portuguese, Chinese, Italian, and Catalan, but not in Bulgarian, Serbo-Croatian, Spanish, Modern Greek, Cypriot Greek, and Romanian. Children in the latter group produce their obligatory objects in transitive contexts from the early stages of language development in a target-like manner. In contrast, children acquiring Russian, Ukrainian, Polish, European Portuguese, Brazilian Portuguese, Chinese, Italian, and Catalan undergo a stage of object omission, in which obligatory transitive contexts do not yield an object in the early stages of first language acquisition. Regrettably, I had to put French and English aside, since the individual studies on each of these languages yielded contrasting results with respect to how much object omission was found in children. Table 3 summarizes the main results from the studies on the sixteen languages under analysis.

Let me discuss the results in more detail. Although results from individual studies on Spanish vary as to how much omission is found in the early stages, all of the studies support the view that Spanish objects are acquired early, around the age of two to three years. On the basis of the elicitation data from 28 children, Wexler et al. (2004) show that two-year-olds literally never omit objects (omission is at 0%). These results are consistent with the spontaneous data provided in Stiasny (2006). In contrast to Spanish, for Catalan Wexler et al. (2004) find high rates of object omission. Two-year-olds omit objects 74% of the time. The object omission remits as age progresses but does not disappear by the age of four years.

Italian patterns with Catalan with respect to object omission – the rate of object omission is high in both languages for ages two to four. Object omissions in Italian have been evidenced both in spontaneous speech (a.o. Guasti 1993) and in elicitation data (Schaeffer 2010). The two-year-olds in Schaeffer's study omit objects at high rates of up to 64%. Object omission at 15% is still present in the production of three-year-olds. These findings are confirmed by similar rates of object omission for the same ages in Tedeschi (2009). It is not before the age of four that Italian children cease omitting their objects and omissions fall to 0%. So whereas Spanish children produce overt objects from early on, Italian children go through an initial phase of object omission (ending at around four years).

In an experimental study for Romanian, Babyonyshev & Marin (2006) find that Romanian-speaking children "produce object clitics freely as soon as they are able to produce utterances that are long enough to contain them" (p. 31). The authors divide their population into groups according to MLU and not according to age.<sup>3</sup> The results indicate object omission of 82% for children with MLU

<sup>3</sup>MLU refers to Mean Length of Utterance, a technique often used in L1 acquisition research to measure the complexity of a child's speech by calculating the number of words (or morphemes, on some approaches) per utterance.


smaller than two, and omission of 13% for children with MLU greater than two. Since Babyonyshev and Marin show that object omission in Romanian is due to production limitations (such as low MLU) instead of a grammatical constraint, we can conclude that the initial stage of language development in Romanian is not characterized by object omission.
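The MLU-based grouping described above can be illustrated with a minimal sketch. It assumes the word-based variant of MLU (words per utterance, as in footnote 3) and simple whitespace tokenization; the function name and the sample utterances are hypothetical and do not come from Babyonyshev & Marin's data. How a child with an MLU of exactly two would be grouped is not stated in the text, so the threshold handling below is likewise an assumption.

```python
def mean_length_of_utterance(utterances):
    """Word-based MLU: total number of words divided by number of utterances."""
    lengths = [len(u.split()) for u in utterances if u.strip()]
    if not lengths:
        raise ValueError("need at least one non-empty utterance")
    return sum(lengths) / len(lengths)


# Hypothetical sample, reusing the Bulgarian example sentences for illustration only.
sample = ["Sresa kuklata", "Toj ja ritna", "Toj ritna topkata"]
mlu = mean_length_of_utterance(sample)

# Assumption: an MLU of exactly 2 is placed in the lower group.
group = "MLU > 2" if mlu > 2 else "MLU <= 2"
print(round(mlu, 2), group)
```

A morpheme-based MLU would require morphological segmentation of each utterance instead of whitespace splitting, which is beyond this sketch.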

When it comes to Slavic languages, Bulgarian and Serbo-Croatian pattern alike since the children in the studies did not omit objects (Radeva-Bork 2013; 2015; Stiasny 2006). No object omission or misplacement has been found in Serbo-Croatian in either elicited or naturalistic production (Stiasny 2006). The same holds for Bulgarian; objects do not get omitted and are used in a target-like manner from around the age of 2;3 on (Radeva-Bork 2015). If we compare Italian and Bulgarian, we see that Italian two-year-olds omit objects 64% of the time (Schaeffer 2010), whereas their Bulgarian peers omit objects only about 30% of the time, i.e. about half as often as in Italian. Null objects fully disappear in Bulgarian towards the end of year three, which is not the case in Italian. Therefore the study results clearly indicate the lack of object omission in the acquisition of Bulgarian. In contrast, in Polish, Ukrainian and Russian, null objects are the preferred option for children (Tryzna 2015; Mykhaylyk & Sopata 2016; Gordishevsky & Avrutin 2004; Frolova 2016). In Polish and Ukrainian, children prefer to use null arguments up to the age of five. At the age of three they omit objects at 89% in Polish and at 68% in Ukrainian (Mykhaylyk & Sopata 2016). The onset of direct object use seems to be semantically affected since around the age of five, clitics/pronouns are used more often for animate referents, and it is only around the age of six that they start being used also for inanimate objects (Mykhaylyk & Sopata 2016).

In Russian, Ukrainian, and Polish, children not only omit direct objects in obligatory transitive contexts, but they also overproduce the null option when compared to adults (in the contexts where null objects are allowed). This holds particularly for Russian, where three- to six-year-old children produce more null objects than adults in the contexts where object omission is a grammatical possibility. Object omission at around 80% was found for the age of three years (Frolova 2016). Even at the age of five, Russian children omit referential objects at 73% and non-referential ones at 54%. As Frolova (submitted) shows, Russian children even omit direct objects in strongly transitive (perfective) contexts where adults tend to use overt nouns but where the null object is still grammatical. Generally, production of null objects in Russian is attested at a similar rate across all age groups up to the age of six, and it is higher than for adults (Frolova 2016). In non-referential contexts, a gradual decrease in object drop, an increase in lexical object (i.e., full DP object) use and a low production of pronouns are observed as age progresses. The rate of null objects is higher in referential contexts, where we rarely find lexical objects while the percentage of pronouns is higher. Similarly to their Russian peers, children acquiring Polish overuse null objects in comparison with adults, and the omission rate decreases as language development progresses (Mykhaylyk & Sopata 2016; Tryzna 2015).

From a cross-linguistic perspective, Chinese, European Portuguese, Brazilian Portuguese, Italian, and Catalan pattern with Russian, Ukrainian, and Polish in terms of the attested object omission in the early stages (for ages two to four and above). Spanish, Modern Greek, Cypriot Greek, and Romanian behave like Bulgarian and Serbo-Croatian in that they are not characterized by object drop in the acquisition process, and objects are present already at the age of two. The latter finding is in contradiction with the predictions made by the Transitivity Requirement (see §2).

### **3.3 Discussion and implications**

The data survey from sixteen typologically different languages (including five Slavic representatives: Bulgarian, Serbo-Croatian, Russian, Ukrainian, and Polish) challenges the obligatory structural presence of null objects postulated by the TR, and calls for re-evaluation of this theoretical analysis of the null object phenomenon in adult grammars. The prediction made by the Transitivity Requirement that children of *all* languages should go through a null-object stage is not borne out – out of the sixteen languages, eight allow object omission in early grammar, six languages do not, and two languages (French and English) show conflicting results. Therefore, there is no evidence that null objects are a default initial setting for acquisition purposes. Instead, there seems to be a clear division between languages with and without object drop already in the early stages.
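The tally behind this conclusion can be reproduced with a short sketch. The classification of each language is taken directly from the summary in §3.2; the variable names and the three outcome labels are illustrative choices, not part of the original studies.

```python
from collections import Counter

# Outcome per language, as summarized in the text (Section 3.2).
EARLY_OBJECT_OMISSION = {
    "Russian": "omission",
    "Ukrainian": "omission",
    "Polish": "omission",
    "European Portuguese": "omission",
    "Brazilian Portuguese": "omission",
    "Italian": "omission",
    "Catalan": "omission",
    "Chinese": "omission",
    "Bulgarian": "no omission",
    "Serbo-Croatian": "no omission",
    "Spanish": "no omission",
    "Modern Greek": "no omission",
    "Cypriot Greek": "no omission",
    "Romanian": "no omission",
    "French": "conflicting",
    "English": "conflicting",
}

counts = Counter(EARLY_OBJECT_OMISSION.values())

# The TR predicts a null-object stage in *all* languages, so a single
# language without early omission is enough to falsify the prediction.
tr_prediction_holds = all(
    outcome == "omission" for outcome in EARLY_OBJECT_OMISSION.values()
)

print(len(EARLY_OBJECT_OMISSION), dict(counts), tr_prediction_holds)
```

Note that the check is deliberately strict: even if French and English were set aside, the six languages without early omission would suffice to contradict the prediction.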

How can the division between languages in terms of object (non)omission be accounted for? Based on the results presented in §3.2, a parallel between children's performance and the actual permission or prohibition of object drop in the target grammars emerges. Children omit objects only if their target grammar provides the null object option, which is the case for Russian, Ukrainian, Polish, European Portuguese, Brazilian Portuguese, Italian, Catalan, and Chinese. In contrast, Bulgarian, Serbo-Croatian, Spanish, Modern Greek, Cypriot Greek, and Romanian do not allow object drop, in the sense of example (1), and children seem to act according to the target grammar rules and produce objects from early on. Hence, early object omissions seem to reflect the presence of (optional) object drop in the target grammar. Children overgeneralize novel intransitives out of novel transitives and drop objects at higher rates than adults, provided that their target grammar has that option. They seem to be faithful to the syntax of the input. This observation is generally supported by experimental evidence in the language acquisition literature, indicating strong input sensitivity in acquisition and target-like omissions in spontaneous data (Ingham 1993). In addition, the data discussed here lend support to the proposal in Varlokosta et al. (2016) that children generally opt for the weakest alternative, in accordance with the scale pronoun > clitic > null, depending on what is available in their language.

Children seem to be faithful to the syntax of the input as their object drop reflects the presence of (optional) object drop in the target grammar and gives no evidence that null objects are a default setting for all languages. Furthermore, for the languages in which children omit objects, they seem to overgeneralize the null option. Data from Chinese as well as from European and Brazilian Portuguese confirm that children tend to overuse the option of object-dropping, licensed by their target grammar in some contexts, as late as at the age of five (Wang & Lillo-Martin 1992, Costa et al. 2012, Lopes 2009). In addition, it seems that if a null argument is available in the grammar, the discourse-pragmatic or semantic features of the direct object referent play an important role in argument realization. This is supported by studies showing a semantic effect on the use of direct objects, for example in Polish, where overt objects (clitics/pronouns) are used more often for animate referents around the age of five. Around the age of six, they are used for inanimate referents. It may be the case that null objects are different from null subjects in that semantic and discourse factors play a greater role in the presence and interpretation of the null object. This, however, needs further investigation.

### **4 Conclusion**

The aim of this paper was to investigate object omission in early child grammar in light of the Transitivity Requirement (TR) approach (Cummins & Roberge 2005), which states that transitivity is not dependent on the lexical features of the verb but is a universal grammatical property. Within this approach, null objects are predicted to be a default initial setting for language acquisition. If null objects are indeed default, we expect to find evidence for object drop in the early stage of development in various languages, irrespective of the (non)omission capacity of the specific target grammars.

The paper reviewed naturalistic and experimental child data from sixteen typologically different languages and showed that out of the sixteen languages, eight languages (Russian, Ukrainian, Polish, European Portuguese, Brazilian Portuguese, Italian, Catalan and Chinese) allow object omission in early grammar, six languages (Bulgarian, Serbo-Croatian, Spanish, Modern Greek, Cypriot Greek, and Romanian) do not, and two (French and English) show conflicting results. The predictions of the TR approach are not borne out and the idea of null objects being a default setting in the early child grammar is invalidated. Instead, there is a clear division between languages with and without object drop in the early stages. In fact, the results from the studies suggest that early object omissions reflect the presence of (optional) object drop in the target grammar. In other words, children seem to omit objects only if their target grammar allows for this option, as is the case, for example, in Russian, Ukrainian and Polish.

### **Abbreviations**


### **Acknowledgements**

I am grateful to two anonymous reviewers for their insightful comments.

### **References**



Esther Ruigendijk (eds.), *Language acquisition and development: Proceedings of GALA 2013*, 360–379. Newcastle upon Tyne: Cambridge Scholars Publishing.



Veenstra, John Weston, Maya Yachini & Kazuko Yatsushiro. 2016. A crosslinguistic study of the acquisition of clitic and pronoun production. *Language Acquisition* 23(1). 1–26. DOI:10.1080/10489223.2015.1028628


### **Chapter 18**

## **Number agreement mismatches in Russian numeral phrases**

### Elena Titov

University College London

This paper looks at two cases of number agreement mismatch in Russian numeral phrases and offers a unified syntactic analysis for both. One case relates to examples where a higher numeral that typically selects a plural NP fails to do so when the head noun lacks a singular lexical form. Instead, an NP headed by a noun that lacks a plural lexical form is chosen despite the selectional requirement of the numeral. The second case concerns data discussed in Franks & House (1982) that involve topicalization of a complement of a lower numeral, which consistently selects a singular NP, with the topicalized NP unexpectedly appearing in the plural form.

**Keywords:** Russian numeral phrases, number agreement, syntax, morphology, contrastive topicalization, information structure

### **1 Genitive of quantification**

Russian numerals are traditionally subcategorized into two groups depending on the number feature carried by the head of the NP they select. The first group of the so-called lower numerals includes numerals from 2 to 4, which consistently select a complement headed by a noun in the singular genitive form, as in (1). The second group of higher numerals includes numerals from 5 and above, which select a complement headed by a noun in the plural genitive form, as in (2).

(1) dva studenta
    two student.gen.sg
    'two students'

Elena Titov. 2018. Number agreement mismatches in Russian numeral phrases. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 401–426. Berlin: Language Science Press. DOI:10.5281/zenodo.2545541


The present paper is concerned with both types of numeral phrases given in (1) and (2) but we start by looking at constructions involving higher numerals, as in (2). The traditional way of analysing (2) is to say that the higher numeral behaves like a noun in the genitive construction, as in (3). That is, the numeral is the head of the NumP taking the quantified NP as its complement and assigning genitive plural to it, so that there is no structural difference between (2) and (3); see (4) and (5).<sup>1</sup>

Curiously, the parallel in the case and number features observed in NumPs headed by a higher numeral and genitive constructions in (2) and (3), respectively, only holds for those NP complements whose head noun has both lexical number forms, plural and singular. Although such nouns constitute the overwhelming majority of Russian nouns, there are exceptions. Thus, the Russian noun *čelovek* 'person' only has a singular lexical form, whereas the noun *ljudi* 'people' only has a plural lexical form.<sup>2</sup> As expected, the case-assigning noun in the genitive construction in (6) can select an NP headed by the noun that only has a plural lexical form, see (6a), but not the noun that only has a singular lexical form, see (6b). What is unexpected is that the NumP headed by a higher numeral behaves in exactly the opposite way, see (7). Despite the fact that the higher numeral typically takes a plural NP complement, this NP cannot be headed by a noun that lacks a singular lexical form, see (7a). Instead, an NP headed by a noun that lacks a plural lexical form is selected, see (7b). As a result, the selected NP fails to carry the genitive plural features, and the noun surfaces in the form that is morphologically identical to the default nominative singular form.<sup>3</sup>

<sup>1</sup>Although (4) represents the most standard approach to NumPs headed by a higher numeral, other analyses exist. One such analysis assumes that the higher numeral is merged in the highest position within the NP and moves to D (Pesetsky 2013). As postulation of the D layer for Russian NPs is rather controversial (see Bošković 2008; 2010), I adopt a more standard representation of NumPs that essentially assumes the same surface hierarchical structure. Whether the numeral has moved to its surface position from within the NP or is generated in it is immaterial for the present analysis. Another analysis proposed in the literature is based on the observation that a higher numeral can undergo left-branch extraction, and can also receive a case from the outside when its complement receives genitive (as in the Russian *po*-construction). To account for this observation, it has been proposed that the numeral is located in the Specifier of a null head, which itself assigns genitive (Franks 1995, Bailyn 2004). For the purpose of the present analysis, it is immaterial whether the numeral is the head of the numeral phrase that assigns genitive, as in (4), or if it is located in the Specifier of a null head that assigns genitive. The analysis in (4) is adopted here mainly for ease of exposition.

<sup>3</sup>The fact that the noun in (7b) surfaces in the form identical to the nominative singular form is in line with the idea that nominative is a morphological default (Marantz 1991, Schütze 1997; 2001). Although languages may differ in the realization of default case, in Russian it is indeed nominative. Thus, the Russian variant of the English phrase *Me intelligent⁈* can only contain a nominative noun. Plausibly, the morphological form of the noun in (7b) is a historical remnant of the old declension paradigm from the time when *čelovek* had both number forms, with the nominative singular and the genitive plural forms coinciding. However, since in modern Russian the plural form is no longer available for *čelovek*, the morphological form of this noun in the context of a higher numeral must have been reanalysed as the default nominative singular form that surfaces due to the morphological deficiency of *čelovek* (see the notation in (7b)). Additional support for this view comes from the fact that *čelovek* is not the only noun that is reanalysed in modern Russian as nominative due to the genitive-nominative syncretism. Russian feminine nouns whose nominative plural and genitive singular forms coincide can be construed as nominative plural in the context of a lower numeral thereby affecting the choice of case form of the modifying adjective, see (i.a). The genitive singular form is also available for these nouns in modern Russian but is less common, see (i.b).


Since both lexical number forms, singular and plural, are available for the noun *devočka* in modern Russian, both structures in (i) are possible. Logically, if one of the lexical number forms disappeared, only one structure in (i) would remain. Plausibly, this is exactly what happened to the noun *čelovek*.

<sup>2</sup>Due to the fact that *čelovek* and *ljudi* have distinct roots and are historically derived from distinct nouns, I assume that they are distinct lexical items. Importantly, an analysis that assumes that *ljudi* involves contextual root allomorphy of *čelovek* in the context of a higher numeral cannot be sustained because in some contexts, either of the two nouns can surface (see footnotes 13 and 14).


	- group person.nom.sg person.gen.sg
	- b. vosem' čelovek
	  eight person.nom.sg
	  'eight people'

The difference in the choice of the noun form illustrated in (6) and (7) strongly suggests that the structural case assigned by a higher numeral is not identical to the lexical case assigned by a noun in the genitive construction. It has been proposed in the linguistic literature that Russian higher numerals assign the so-called genitive of quantification (GQ) rather than simple genitive (Bošković 2006). If so, we can hypothesise that GQ places a specific requirement on the head of the NP, which results in the pattern observed in (7). In particular, being a quantificational case, GQ may require that the NP receiving it is headed by a noun that has a lexically realised unit for counting, see (8). Nouns that do not have a singular lexical form will, then, be expected to fail to head an NP that receives GQ, as such nouns lack a lexically realised unit for counting.<sup>4</sup>

<sup>4</sup>The hypothesis put forward in (8) is additionally supported by data involving mass nouns, as in (i), and nouns belonging to the group of pluralia tantum, as in (ii). Both types of nouns lack a unit for counting and, hence, fail to head the NP that receives GQ from the higher numeral, see (i.a) and (ii.a). The only way these nouns can occur in NumPs headed by a higher numeral is when they head an NP that receives genitive from the noun that has a lexical singular form and therefore can head the NP that receives GQ from the higher numeral, as in (i.b) and (ii.b).


It is of course true that in English pluralia tantum also fail to head NP complements to numerals. However, since the present paper is on Russian, a discussion of English is left for future research. Another issue that has to be left for future research is that although structures like


(8) NPs headed by a noun that lacks a unit for counting are unable to carry GQ.<sup>5</sup>

If the rule given in (8) is correct, Russian higher numerals have a difficult time dealing with nouns that lack one of the lexical number forms. We have seen in (2) that higher numerals require plural agreement with their NP complement. At the same time, (8) demands that the relevant NP is headed by a noun that has a singular lexical form. When the head noun has both lexical number forms, both of these requirements can be obeyed, as in (2). Conversely, when the head noun has only one of the number forms, as is the case with *čelovek* and *ljudi* in (7), a choice must be made as to which requirement is obeyed at the cost of violating the other, given that both of them cannot be obeyed simultaneously. The data in (7) demonstrate that Russian chooses to obey (8) at the cost of violating the requirement for plural agreement. That is, the noun in the well-formed structure in (7b) has a singular form. The NP it heads can therefore receive structural GQ from the numeral. However, this noun lacks a plural form. It therefore fails to realise the genitive plural features required for agreement with the higher numeral and surfaces in the default nominative singular form.
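The interaction of the two requirements can be summarised in a small Python sketch. It is only an informal illustration, not part of the analysis: the `Noun` class, its field names and the function `form_after_higher_numeral` are hypothetical, and the sketch deliberately equates "has a unit for counting" with "has a singular lexical form", so mass nouns and pluralia tantum are not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Noun:
    """Toy lexical entry (hypothetical; not the paper's formalism)."""
    nom_sg: Optional[str]  # None: the noun lacks a singular lexical form
    gen_pl: Optional[str]  # None: the noun lacks a plural lexical form


def has_counting_unit(noun: Noun) -> bool:
    # Simplifying assumption: a unit for counting is available exactly
    # when the noun has a singular lexical form.
    return noun.nom_sg is not None


def form_after_higher_numeral(noun: Noun) -> Optional[str]:
    """Surface form of the head noun, or None if the NP is excluded by (8)."""
    if not has_counting_unit(noun):
        return None         # (8) is violated: the NP cannot carry GQ
    if noun.gen_pl is not None:
        return noun.gen_pl  # both requirements are met: genitive plural
    return noun.nom_sg      # PF cannot realise gen.pl: default nom.sg


KILOGRAMM = Noun(nom_sg="kilogramm", gen_pl="kilogrammov")
CELOVEK = Noun(nom_sg="čelovek", gen_pl=None)
LJUDI = Noun(nom_sg=None, gen_pl="ljudej")
```

Under these assumptions, `form_after_higher_numeral(CELOVEK)` returns the nominative singular form, mirroring (7b), while `form_after_higher_numeral(LJUDI)` returns `None`, mirroring the ill-formed (7a). The order of the two checks encodes the claim that syntactic licensing outranks the realisation of plural features.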

Following Bobaljik (2008), I assume that morphological case (m-case) must be distinguished from structural case, with m-case being treated as a morphological phenomenon applying at PF and structural case as syntactic NP licensing (see also Harley 1995, Marantz 2000, McFadden 2004, Schütze 1997, Sigurðsson 1991, Sigurðsson 2003, Yip et al. 1987, Zaenen et al. 1985). Assuming that the proper place of agreement, which is dependent on m-case, is the morphological component that is a part of the PF interpretation of structural descriptions (Bobaljik 2008), we can argue that in (7) the choice is made between the requirement for the NP complement to Num to be syntactically licensed through structural GQ,

(ii.a) are never used in formal register and are perceived as ungrammatical by my consultants and myself, they can be found in colloquial Russian. A possible explanation for this occurrence is that speakers that allow (ii.a) analyse the noun heading the NP complement to the numeral in (ii.b) as an optionally null classifier due to its invariable form (i.e., no other noun can be used with pluralia tantum).

<sup>5</sup>This rule refers to nouns that lack a lexically realised unit for counting. This includes mass nouns, collective nouns, pluralia tantum and countable nouns that lack a non-suppletive lexical singular form. Importantly, nouns like *deti* 'children' do not fall under this category despite having a suppletive singular form *rebjonok* 'child' in modern Russian. This is because the non-suppletive form *ditja* 'child' still exists in the language even though it is perceived as stylistically marked and somewhat archaic. The noun *ljudi* 'people', conversely, has never had a non-suppletive lexical singular form as it was historically derived from a collective noun, i.e., *ljud* 'people, folk' (Chumakina et al. 2004) that already lacked a unit for counting.


and the requirement for it to realise plural features at PF. The data in (7) suggest that syntactic well-formedness is a stronger requirement. That is, what we observe in (7b) is that a well-formed syntactic representation containing a structurally licensed NP is generated, but when this representation reaches PF, the latter fails to realize the genitive plural features on the defective noun (i.e., the noun that lacks a plural lexical form).<sup>6</sup>

### **2 The numeral-classifier construction**

The pattern observed in (7) breaks down in constructions involving modification or topicalization, see (9) and (10), creating an apparent counterexample to (8). That is, once a modifier interferes between the numeral and the noun, selecting an NP headed by a noun that lacks a singular lexical form becomes possible, as in (9b), in an apparent violation of (8), whereas using a noun that lacks a plural lexical form, as in (9a), is not acceptable to all native speakers of Russian.<sup>7</sup>

	- b. vosem' krasivyx ljudej
	  eight pretty.gen.pl people.gen.pl
	  'eight pretty people'

Similarly, when the NP is topicalized, as in (10c), a noun lacking a singular lexical form is selected in an apparent violation of (8). A noun lacking a plural lexical

<sup>6</sup>The present analysis assumes a competition of syntactic and PF constraints, with syntactic constraints winning the competition. I do not propose an Optimality Theoretical account for this competition because I do not take syntactic constraints to be violable.

<sup>7</sup>Although Russian prescriptive grammars state that (9a) is ungrammatical, I have come across speakers that accept it. I have therefore used questionnaires in order to establish which form in (9) is more acceptable to native speakers of Russian (judged on the scale from 1 to 5, with 5 being fully grammatical and 1 fully ungrammatical). Out of forty-six native speakers questioned, four favoured (9a) and forty-two favoured (9b). Out of the group of speakers that favour (9a), two speakers clarified that since the phrase in (7a) is ungrammatical, it should be ungrammatical even in the presence of modification, while the other two speakers did not explain their preference. Out of the group of speakers that favour (9b), eight speakers found (9a) fully ungrammatical (in line with my own judgement as a native speaker of Russian), whereas the remaining thirty-four speakers found it marginally acceptable (none of them gave it a five or a four) but degraded with respect to (9b) (two speakers have independently suggested that (9a) is restricted to contexts involving contrast).


form, on the other hand, cannot be used in topicalized NPs (see (10b)) despite being chosen in the structure prior to topicalization (see (10a)).<sup>8</sup>

(10) a. V komnate bylo vosem' čelovek.
        in room was.3sg.n eight person.nom.sg
        'In the room there were eight people.'
     b. ⁇ Čelovek<sub>1</sub> v komnate bylo vosem' t<sub>1</sub>.
        person.nom.sg in room was.3sg.n eight
     c. Ljudej<sub>1</sub> v komnate bylo vosem' t<sub>1</sub>.
        people.gen.pl in room was.3sg.n eight
        'As for people, there were eight of them in the room.'

The data in (9) and (10) present a challenge for (8). In particular, if the higher numeral assigns GQ to its NP complement and thus places the restriction in (8) on it, (9b) should be impossible, as it seemingly contains a syntactically unlicensed NP. Similarly, in (10c) the topicalized NP is expected to reconstruct but it cannot reconstruct into the position where it receives GQ, as in (10a), because reconstruction to this position of the NP headed by a noun that lacks a singular lexical form, as in (10c), violates (8). A logical solution for (10) would be to assume that the topicalized NP in (10c) reconstructs to some other position, where it receives some case other than GQ. If so, this position might also be the position that hosts the NP in (9b). Let us use this assumption as our working hypothesis and try to establish what this position is and what case is assigned to the NPs in (9b) and (10c) and by what head.

As a starting point let us look at (11). We have hypothesised in (8) that a noun lacking a unit for counting cannot head an NP that receives GQ. We have based this hypothesis on (7a) but we expect it to apply to any noun that lacks a unit for counting, including mass nouns. This prediction is indeed borne out in (11).<sup>9</sup> It is nevertheless possible to express the meaning of (11) with a grammatical sentence as long as the NP headed by a mass noun receives genitive or partitive case from the head of the NP that receives GQ from the numeral, as in (12).

(11) \* Na stole stojalo vosem' čaja / čaju.
     on table stood.3sg eight tea.gen / tea.part

<sup>8</sup> (10b) is marginally acceptable under the interpretation of approximate inversion (although this word order still feels like the result of a production error) but not under the interpretation and intonation associated with the topicalization of the NP.

<sup>9</sup>The ungrammaticality of (11) cannot be due to the lack of plural agreement with the higher numeral, as such a violation is tolerated in (7b).


(12) Na stole stojalo vosem' stakanov čaja / čaju.
     on table stood.3sg eight glasses.gen.pl tea.gen / tea.part
     'There were eight glasses of tea on the table.'

The assignment of GQ is possible in (12) because the NP that receives it is headed by a countable noun that has both lexical number forms. The availability of a singular lexical form ensures that there is no violation of (8), while the availability of a plural lexical form allows for the realisation of the genitive plural features; see (13).

In (13), the mass noun that cannot head NP<sub>1</sub>, which receives GQ from the numeral, can nevertheless head NP<sub>2</sub>, which is contained in the NumP and c-commanded by the numeral. The crucial hypothesis that I would like to put forward is that the same strategy is used in (9b) and (10c), as shown in (14).

In (14) the NP<sub>2</sub> headed by the noun that lacks a singular lexical form receives genitive plural from a phonologically null quantifying expression (QE) that heads NP<sub>1</sub> carrying GQ.<sup>10</sup> The questions that will be addressed in this section are the following. What is the nature of the QE in (14)? Can it be overt? What licenses its covert status?

<sup>10</sup>The idea that numeral phrases may contain phonologically null nouns has also been proposed in Kayne (2005).

I would like to propose that the head of NP<sub>1</sub> in (14) is the lexical variant of the noun 'person/people' that only has a singular lexical form, as in (15). (In the following examples, small caps mark the focus of the sentence.)

(15) [Krasivyx ljudej]<sub>1</sub> v komnate bylo vosem' (čelovek) t<sub>1</sub>.
     pretty.gen.pl people.gen.pl in room was.3sg.n eight person.nom.sg
     'As for pretty people, there were eight of them in the room.'

The structure for (15) is given in (16). This construction has been referred to in the linguistic literature as the numeral-classifier construction (NCC) (see Sussex 1976, Yadroff 1999 and Pesetsky 2013). It is forced in structures with approximate inversion involving modification of the type *čelovek pjat' krasivyx ljudej* 'approximately five pretty people'.<sup>11</sup> Following Yadroff (1999), I assume that the

	- b. muzykantov pjat'
	  musicians.gen.pl five
	  'approximately five musicians'
	- b. \* muzykantov pjat' talantlivyx
	- c. \* talantlivyx pjat' muzykantov
	- d. \* talantlivyx muzykantov pjat'
	- e. \* muzykantov talantlivyx pjat'

<sup>11</sup>In the absence of modification, inversion can take place in a structure that does not contain the QE *čelovek;* see (i). However, if the noun is modified, any type of movement to pre-numeric position – be it just the noun inverted, as in (ii.b), just the adjective inverted, as in (ii.c), or both words inverted, as in (ii.d) and (ii.e) – is ungrammatical. In this case, the structure in (16) with the inverted pleonastic noun *čelovek* must be used, as in (iii) (see also Mel'čuk 1985 and Yadroff 1999).


QE in constructions of the type given in (16) is not a normal noun but a classifier and assign it to a category that Yadroff calls Measure. As can be seen from (15) and (16), the QE that heads the MeasureP can be overt. The option of being covert, on the other hand, is plausibly licensed by the limited semantic function and the semantic recoverability of the QE. To be precise, the QE in (16) has no other semantic function but to pick out a certain number of individuals from the set represented by its NP complement.<sup>12</sup>

The set denoted by the NP is a subset of the set denoted by the QE. In other words, the set denoted by the NP interpretively restricts the set denoted by the QE. Consequently, the QE consistently represents the superset of the set represented by its NP complement. Plausibly, the default superset construal is one of the factors contributing to the semantic recoverability of the QE. However, as we will see in §4, this is not a sufficient factor and additional restrictions on semantic recoverability apply.

If we are right in assuming that the interpretation of the superset of the set represented by the NP is a crucial factor for the semantic recoverability of the QE, we expect that when *čelovek* does not take an NP complement, it must be overt and the set it represents is unrestricted. This is indeed the case in (7b), where *čelovek* takes no NP complement. It therefore refers to an open set of people and is obligatorily overt.

<sup>12</sup>If the QE is allowed to be covert due to its limited semantic function, we expect that when it performs an additional semantic function, it must be overt. This is indeed the case in structures involving approximate inversion, where the QE cannot be covert; see (i).

	- b. pjat' krasivyx ljudej
	  five pretty people.gen.pl
	  'five pretty people' (not: 'approximately five pretty people')

The analysis in (16) entails that Russian higher numerals consistently assign GQ to their complements and that (8) always holds, whereas the NP headed by the noun *ljudi* never ends up in the position receiving GQ. Instead, this NP is consistently selected by an optionally null QE that assigns genitive plural to it. This assumption captures the problematic data in (9) and (10). Yet, the reader might wonder why the structure in (16) is not used in (7a), which should make it well formed. I would like to argue that the structure in (16) is indeed available for (7a) but employment of this structure results in semantic oddness. Indeed, the structure in (7a) is as semantically odd as the one in (17), where the QE is overt, because in both examples the open set of people represented by the QE (covert or overt) is not restricted by a more specific subset of people denoted by its NP complement. The NP is interpreted as referring to an open set of people but an open set of people is already denoted by the QE. We have argued that the QE can only take an NP complement that restricts its set. That is, given that in (16) the QE denotes an open set of people, the NP must refer to a set of people with some specific features or qualities, such as 'pretty people' in (9b). Whenever the QE takes an NP complement that represents exactly the same open set, this results in redundancy and subsequent semantic oddness; see (7a) and (17).<sup>13</sup>
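The redundancy condition described above can be pictured with finite sets. The following Python sketch is a rough, purely extensional approximation: the function `restricts`, the toy universe `PEOPLE` and the property set `PRETTY` are all hypothetical, and a finite set can only stand in for what the text calls an open set of people.

```python
def restricts(qe_set: frozenset, np_set: frozenset) -> bool:
    """True if the NP denotes a non-empty proper subset of the QE's set."""
    return bool(np_set) and np_set < qe_set


# Hypothetical toy universe standing in for an open set of people.
PEOPLE = frozenset({"Anna", "Boris", "Vera", "Gleb"})
PRETTY = frozenset({"Anna", "Vera"})

pretty_people = PEOPLE & PRETTY  # the NP in (9b), 'pretty people'
bare_people = PEOPLE             # the NP in (7a) and (17), 'people'
```

Here `restricts(PEOPLE, pretty_people)` is true, so the NP narrows down the set contributed by the QE, whereas `restricts(PEOPLE, bare_people)` is false, which corresponds to the redundancy that makes (7a) and (17) semantically odd.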

	- b. (čelovek) vosem' hobbitov i ljudej
	  person.nom.sg eight hobbits.gen.pl and people.gen.pl
	  '(approximately) eight hobbits and people'

<sup>13</sup>As expected, (7a) improves when the set represented by the QE is semantically restricted, as in (i). The acceptability of (i) strongly suggests that (7) cannot be accounted for by assuming a morpho-phonological constraint that bans linear adjacency between *vosem'* and *ljudej.* Furthermore, linear adjacency is possible in a coordinate structure with the interpretation 'a group of (approximately) 8 individuals some of which are men and some hobbits'; see (ii.a). As can be seen from (ii), when the QE selects a coordinate NP that represents two sets – a set of people and a set of hobbits, no semantic oddness obtains because the set denoted by the QE is restricted by a more specific subset of hobbits.


(17) \* vosem' čelovek ljudej
     eight person.nom.sg people.gen.pl

Crucially, whenever the NP that is complement to the QE is topicalized, as in (10c), semantic oddness disappears, strongly suggesting that the topicalized NP refers to a more specific set than the one denoted by the QE. In the next section, we discuss the nature of this set and discover why the structure in (16) is obligatory for (10c).

### **3 The plurality requirement**

We have argued that modification makes it possible for higher numerals to take MeasureP complements headed by an optionally null classifier that in turn takes an NP complement that can be headed by the noun *ljudi;* see (9b) and (16). We have maintained that this option is determined by the semantics of the NP. In particular, the NP must restrict the set denoted by the classifier. In the absence of such a restriction, the NCC cannot be formed (see (7a) and (17)), whereas modification makes such a restriction possible. At the same time, we have seen that the structure with *čelovek* in (9a) is acceptable to some speakers but not others (see footnote 7). Let us consider the grammar of both types of speakers. Plausibly, speakers who (like myself) find (9a) ill formed interpret the noun *čelovek* in (9a) as a classifier due to its impoverished morphological form. This is because nouns that have both lexical number forms surface in the nominative singular form when used as classifiers (see (18a)) but in the plural genitive form required for agreement with the higher numeral when used as heads of NPs (see (18b)). When the noun is nominative singular and hence construed as a classifier, modification is impossible (see (18c)) in line with the observation that classifiers generally resist modification. By hypothesis, speakers of my variety transfer the classifier analysis to any noun that surfaces in the nominative singular form in the context of a higher numeral and analyse (9a) in parallel with (18c).

	- b. vosem' polnovesnyx kilogrammov
	  eight full-weight.gen.pl kilograms.gen.pl
	  'eight full-weight kilograms of apples'
	- c. \* vosem' polnovesnyx kilogramm
	  eight full-weight.gen.pl kilogram.nom.sg


The fact that nominative singular classifiers generally resist modification is plausibly due to a *φ*-feature conflict that results from the adjective realising the case and number features required for agreement with the higher numeral and the classifier being unable to realise them, as in (19).<sup>14</sup>

Since the adjective in (9a) and (19) is part of the NP that enters into an agreement relation with the numeral, it must realise the genitive plural features. Incidentally, no other morphological form of the adjective but genitive plural can surface in NPs receiving GQ from a higher numeral.<sup>15</sup> The classifier, conversely, surfaces in what appears to be the default nominative singular form. This, in turn, generates a conflict within the NP resulting from a mismatch in the case and number features between the head and the modifier; see (19).<sup>16</sup> Plausibly, it is this mismatch that results in the ill-formedness of (9a) for speakers of my variety. Naturally, a

(i) vosem' čelovek s krasivymi licami
    eight person.nom.sg with pretty faces
    'eight people with pretty faces'

<sup>14</sup>The ungrammaticality of (9a) cannot be due to modification as such, as modifiers that do not enter into an agreement relation with *čelovek* can surface in this type of construction; see (i) below.

<sup>15</sup> Unlike Serbo-Croatian, Russian does not have uninflected 'indeclinable' modifiers.

<sup>16</sup> It appears that the crucial violation here is the case feature mismatch, as a number feature mismatch is tolerated in Russian NPs that are complements to lower numerals. Pesetsky (2013) accounts for the number feature mismatch found in contexts of paucals by assuming that the adjective merges with N or a projection of N and agrees with the closest number-bearing element, which is the [–singular] paucal. The noun, on the other hand, enters syntax bearing no number feature (NBR) and immediately merges with the paucal, which is a free-standing instance of NBR rather than a numeral. As a result, the adjective is [−singular], whereas the noun is not specified for the [−singular] feature.


structure with a plural noun, as in (9b), does not suffer from a *φ*-feature conflict. However, the NP in (9b) cannot carry GQ as it is headed by a noun that lacks a singular lexical form; see (8). Hence, the NCC in (16) must be formed for (9b). To rephrase, (16) is licensed by the plurality requirement placed on the noun by the adjective in my variety of Russian.<sup>17</sup>

Conversely, speakers that accept (9a) must be insensitive to the aforementioned *φ*-feature conflict. This might be because, even in the absence of modification, such NumPs involve a *φ*-feature violation that is tolerated, i.e., the noun in (7b) does not realise the genitive plural features required for the agreement with the higher numeral. By hypothesis, insensitivity to the *φ*-feature conflict between the adjective and the noun allows these speakers to interpret *čelovek* as a full noun rather than a classifier despite its impoverished morphological form. If so, the structure in (4) is generated in the grammar of these speakers for the numeration in (9a), while the NCC in (16) is generated whenever the simpler structure in (4) is unavailable, as in (9b). We would, then, expect to find speakers that favour (9a) over (9b) due to its simplicity along with speakers that accept both structures to a certain degree but assign distinct contextual interpretations to them. This prediction appears to be borne out (see footnote 7).

Since for speakers of my variety, (9a) is ill formed due to a plurality requirement placed on the noun, which in turn triggers the structure in (16), it is not completely outlandish to assume that (10b) is ill formed for a similar reason. Namely, a plurality requirement is placed on the topic NP, which rules out the structure with a noun that lacks a plural lexical form. I would like to propose that the relevant plurality requirement follows from the interpretive properties of NumPs that contain a trace of a topic NP. Let us consider these properties. The sentence in (10c) has a typical Top/Foc structure, with the topic NP construed as a contrastive topic (CT) and the numeral constituting the narrow focus of the

<sup>17</sup>In the absence of a plurality requirement, the formation of the NCC is possible only when the QE is overt, as in (i) and (ii).

(i) čelovek vosem' talantlivyx muzykantov
    person.nom.sg eight talented.gen.pl musicians.gen.pl
    'approximately eight talented musicians'

(ii) V orkestre rabotajet pjat' čelovek skripačej, i šest' čelovek duxovikov.
     in orchestra work.3sg five person.nom.sg violinists.gen.pl and six person.nom.sg wind-players.gen.pl
     'In the orchestra work five violinists and six wind-players.'


sentence. Thus, (10c) most naturally occurs in a context that asks about the quantity of individuals present in the room and therefore licenses narrow focus on the numeral, as in (20). It is, however, incompatible with a context that licenses focus on the entire NumP, as in (21). (Sentences marked with '#' are grammatical but incompatible with the given context.)

	- A: Ljudej<sub>1</sub> v komnate bylo vosem' t<sub>1</sub>.
	  people.gen.pl in room was.3sg.n eight
	  'As for people, there were eight of them in the room.'
	- A: # Ljudej<sub>1</sub> v komnate bylo vosem' t<sub>1</sub>.
	  people.gen.pl in room was.3sg.n eight
	  'As for people, there were eight of them in the room.'

The question (20Q) can be answered by a simpler sentence that does not contain a CT; see (22).

	- A: V komnate bylo vosem' čelovek.
	  in room was.3sg.n eight person.nom.sg
	  'In the room there were eight people.'

However, the replies (20A) and (22A) are not only structurally different, their interpretation is also distinct: while (22A) merely answers the question about the quantity of people in the room, (20A) additionally conveys that people were not the only individuals present in the room that are relevant for the discussion at hand but they were the only individuals for whom the quantity (i.e., the focus value) is known. Since for other individuals in the room the quantity is unknown, the sentence is perceived as providing incomplete information. The interpretation of incompleteness is what characterizes the information-structural (IS) category of CT (Büring 2003), strongly suggesting that the topic NP in (20A) and (10c) is a CT.<sup>18</sup> This conclusion is further supported by the observation that (10c) has the prosodic pattern typical of CT/Foc sentences, with the rising topic contour IK3 on the topicalized NP and the falling contour IK1 on the focused numeral (Bryzgunova 1971; 1981, Titov 2013).

The set introduced by the CT in (20A) and (10c) is a subset of a set of individuals that were present in the room. That is, even when the CT refers back to an identical discourse-antecedent, as in (20), the sentence itself activates the superset construal, as it conveys that just a subset of the set of individuals in the room that are relevant for the discourse at hand are people. This means that the superset for the set of people becomes salient at the point the sentence is uttered.

The above observation provides an answer to the question we posed in the previous section. Recall that while the sentences in (7a) and (17) are semantically odd because the QE in these examples takes an NP complement that represents exactly the same open set, the sentence in (10c) does not suffer from semantic oddness. We have suggested that this is because the topicalized NP refers not to an open set of people but to some other set that restricts the set introduced by the QE. Indeed, contrastive construal of the topicalized NP in (10c) results in the interpretation according to which this NP belongs to a contextually closed set of individuals that were present in the room, for some of whom the quantity is unknown. In other words, the CT in (10c) does not represent an open set of people but a subset of individuals that were present in the room. Plausibly, this contextual restriction of the set to which the NP belongs eliminates redundancy and semantic oddness that we observe in (7a) and (17).

Another crucial observation as regards the interpretive properties of (10c) is that the NumP here is obligatorily non-referential. This is because the verb here is in the default third person singular form. The availability of default agreement is due to NumPs in Russian being construed by syntax either as NPs or QPs (Pesetsky 1982). In the former case, the verb agrees with the nominative NP, as in (23a), and the NP allows for definite/specific reading, while in the latter case, agreement cannot take place and the QP is interpreted as a non-specific indefinite (see (23b)) (Titov 2012).<sup>19</sup>

<sup>18</sup>To be interpreted as a CT, the relevant NP must linearly precede the focus in Russian (Titov 2013).

<sup>19</sup>The fact that NumPs in sentences with default agreement cannot be referential is further supported by the observation that they cannot take an apparent wide scope typical of specific indefinites; compare (i) and (ii) below. While the sentence in (ii) allows for the reading where two specific students failed all of the exams, (i) can only mean that for each exam there were two students that failed it.


	- b. V in komnatu room vošlo entered.3sg (\*èti/ètix) these.nom/gen vosem' eight čelovek. person.nom.sg

We have seen that the sentence in (10c) has narrow focus on the numeral. Plausibly, this IS partitioning forces syntax to interpret the NumP as a QP rather than an NP, as the sentence in (10c) cannot contain an agreeing verb (see (24)), resulting in the obligatorily non-specific indefinite construal of the NumP (see (25)).


Due to the non-specific construal, the NumP in (10c) cannot refer to a specific set of eight people. Instead the focused numeral selects a subset of eight people from the set introduced by the CT (i.e., the NP), strongly suggesting that we are dealing with the so-called set partitive interpretation of the NumP.<sup>20</sup> Given that only NPs that can denote sets of entities are allowed in set partitives, such NPs must contain plural nouns (de Hoop 1997). Hence, it is the set partitive construal of the NumP that places a plurality requirement on the topic NP in (10c), rendering (10b) ungrammatical.<sup>21</sup>

<sup>20</sup>Numerals cannot occur in entity partitives.

<sup>21</sup>Following Barker (1998), I assume that partitives are anti-unique. Due to anti-uniqueness, partitives are inherently non-specific indefinites, resulting in DP partitive constructions being unable to be headed by a definite determiner. The data in (25) can be seen as supporting this idea.

It has been suggested that the quantifier in partitive constructions is followed by an empty noun (Milner 1978, Bonet & Solà 1986, Abney 1987, Hernanz & Brucart 1987, Delsing 1988; 1993, Ramos 1992, Cardinaletti & Giusti 1992; 2006, Sleeman 1996, Doetjes 1997, Barker 1998, Brucart & Rigau 2002, Ionin et al. 2006). This assumption is motivated by the observation that a partitive construction of the type given in (10c) denotes two sets (here it is a general set of people and a set of eight people present in the room). The Catalan example in (26a), where *e* is lexically identical to *homes* 'men', illustrates this idea.

(26) Catalan (Martí i Girbau 2010: 27)


In (26a), the partitive construction refers to two sets of men: the set of those men and the set of three men, the latter being a subset of the former. The NumP in (26b) has an overt noun inserted between the quantifier and the PP and is grammatical, albeit odd and redundant to a native speaker. The NumP in (26c) has an empty noun holding the final noun position. Overall, this is taken as evidence that an empty noun category should be posited to license a partitive meaning. In line with this observation, the present analysis assumes the structure in (16) for the partitive NumP in (10c), where an optionally null classifier occurs between the numeral and the genitive NP.

In this section, we have argued that a plurality requirement placed on a noun forces the structure in (16) whenever this noun lacks a singular lexical form and can therefore not head an NP that receives GQ from a higher numeral; see (8). Economy considerations predict that the more complex NCC is generated for NumPs that do not have an overt QE if and only if a plurality requirement forces plural features on the noun but the NP this noun heads fails to be generated in the complement to the numeral position, for instance because of (8). In this case, and this case alone, the simpler structure in (4) is not available for the given numeration. In all other cases, (4) is chosen by the grammar as the more economical structure.

### **4 Franks & House (1982)**

Further evidence for the NCC analysis comes from the data discussed in Franks & House (1982) that involve topicalization of an NP in the genitive plural form that appears to take place from a position to which genitive singular is assigned; see (27).

	- b. \* Na on stole table bylo was.3sg.n dva two romanov. novels.gen.pl

The head of the numeral phrase in (27) is a lower numeral that consistently takes a genitive singular NP complement, as in (28a). A genitive plural NP cannot be licensed in the complement to lower numeral position; see (28b). Yet, while the topicalized NP *romanov* 'novels' in (27) carries a genitive case marker, it is, surprisingly, in a plural form. Franks & House maintain that the topic NP cannot have been extracted from the argument *dva* 'two' because the latter assigns the genitive singular, not the genitive plural. Hence, they propose that the genitive NP is an external topic that forms a constituent with a covert quantifier, which accounts for the genitive case marking. The overt quantifier raises at LF, licensing the null quantifier of the genitive constituent. However, as Franks & House point out, the genitive topic in (27) is different from other attested external topics in Russian (i.e., nominative topics) in that the former is not obligatorily followed by a pause. Moreover, the genitive topic in (27) requires a numeral in the clause that refers back to the genitive NP. This, of course, cannot be said about other external topics. And finally, Franks & House's analysis of the number agreement mismatch in (27) cannot be applied to the cases of number agreement inconsistencies discussed above that do not involve topicalization.

Hence, the NCC analysis appears to be better suited for (27). On this account, the sentence in (27) contains an optionally null QE whose semantic set is restricted by the topic NP, as in (29) and (30). The structure in (30), just like the one in (16), is licensed by two conditions: (i) the plurality requirement placed on the CT (i.e., NP) that moves out of a non-specific NumP with a set partitive construal, and (ii) the impossibility of reconstruction of the plural NP to the complement to Num position. In the case of (16), the latter condition results from (8). In the case of (30), it results from the fact that a lower numeral cannot take a plural NP complement; see (1) and (28b). Importantly, the generation of the more complex NCC is possible only when the two conditions prevent the generation of the simpler structure in (4). In all other cases, economy rules out the NCC and the structure in (4) is used.

(29) Romanov<sub>1</sub> novels.gen.pl na on stole table bylo was.3sg.n dva two (toma) volume.gen.sg t<sub>1</sub>. 'As for novels, there were two volumes on the table.'

By analogy with (16), the head of the MeasureP in (30) is optionally null. As the set represented by the QE is consistently a superset to the set introduced by its NP complement, the QE is semantically recoverable, in the sense that when it is null, it can be interpreted as representing any set of which the set denoted by the NP is a subset. In (30) the overt QE denotes a set of volumes on the table out of which a set of novels is a subset, but the set represented by the QE can be even more open and denote a set of books on the table out of which a set of novels is a subset, as in (31).

(31) Romanov novels.part/gen.pl na on stole table bylo was.3sg.n dve two knigi. book.gen.sg 'As for novels, there were two books on the table.'

Typically, the set represented by the QE is contextually specified, as in (32) where it is given as the superset in the contextual question. That is, depending on whether the items on the table out of which the set of novels is selected are books or different kinds of reading materials (e.g. novels, newspapers, magazines, journals etc.) or different kinds of unrelated items (e.g. novels, apples, plates, flowers etc.), the set can be as open as to include all inanimate entities, as long as the context (linguistic or extra-linguistic) warrants such a construal.

	- (32) A: Romanov novels.part/gen.pl na on stole table bylo was.3sg.n dve two knigi, book.gen.sg stixov poems.part/gen.pl (na on stole table bylo) was.3sg.n tri three (knigi), book.gen.sg a and slovarej dictionaries.part/gen.pl (na on stole table bylo) was.3sg.n četyre four (knigi) book.gen.sg 'There were two books of novels on the table, three (books of) poems and four (books of) dictionaries.'

It is, however, plausible that when the QE in (30) is phonologically null and the context does not specify the nature of the set it denotes, it is interpreted as representing the most open set out of which the set denoted by its NP complement is a subset. We have seen that the most open superset for individuals is a set of people represented by the noun *čelovek*. Similarly, in (30) the most open superset for the set of inanimate entities is the set of items, represented by the noun *štuka*, as in (33).<sup>22</sup>

(33) Romanov novels.part/gen.pl na on stole table bylo was.3sg.n dve two štuki. item/thing.gen.sg 'As for novels, there were two items on the table.'

In (33), the noun *štuka* 'item/thing' selects a certain number of entities from a set of novels in exactly the same fashion as the noun *čelovek* 'person' selects a number of individuals from a set of pretty people in (15) so that the only difference in the construal of the QEs in (33) and (15) lies in the features [±animate] and [±human].<sup>23</sup> In other words, *štuka* represents the most open set of entities, while *čelovek* denotes the most open set of individuals. Plausibly, in the absence of a contextual disambiguation, the null QEs in NCCs are interpreted as referring to these open sets.

<sup>22</sup>The QE in (31)–(33) cannot be phonologically null when Num carries feminine gender features required for agreement with the feminine MeasureP. When the QE is null, Num agrees in gender with the masculine NP. As the MeasureP and the NP in (31)–(33) have distinct gender features, the constructions with an overt and a covert QE have distinct agreement features on the numeral.

<sup>23</sup>Yadroff (1999) analyses the nouns *štuka* and *čelovek* used in NCCs as pleonastic noun classifiers. He argues that the class of classifiers found in NCCs is closed, with *štuk* 'items.gen.pl' replaced with *èkzempljárov* 'copies.gen.pl' in formal register, and *čelovek* 'person.nom.sg' replaced with *duš* 'souls.gen.pl' in archaic texts. However, as can be seen from (29)–(32), it is possible to have other nouns performing the role of the QE as long as they represent a superset to the set denoted by the NP complement. Just like any other classifier mentioned by Yadroff, the QEs in (29)–(32) can occur in a construction involving approximate inversion, as in (i).

(i) knig books.gen.pl / tomov volumes.gen.pl pjat' five istoričeskix historical.gen.pl romanov novels.gen.pl 'approximately five books/volumes of historical novels'


Above, we mentioned that the superset construal of the QE is not sufficient for it to remain null and that additional restrictions on semantic recoverability apply. To be precise, while the QE in (30) can remain covert in structures involving contrastive topicalization, as in (27), it must be overt in the absence of topicalization; see (34).

	- b. \* Na on stole table bylo was.3sg.n dva two romanov. novels.part.pl

Plausibly, the option of remaining covert in (27) is due to the IS partitioning of the non-referential NumP into focus on the Num and CT on the NP, which results in a set partitive construal of the NumP, which in turn requires the presence of the QE (null or overt) in order for the partitive construction to denote two sets. It follows, then, that partitive construal itself presupposes the NCC containing the QE. Conversely, in (34b), it is impossible for the NumP to have the corresponding CT/Foc partitioning because the NP does not move across the numeral (Titov 2013). Hence, in the absence of contrastive topicalization, the QE must be overt. Yet, there is one exception to this rule, i.e., the QE can stay covert and be recovered when it refers to the same set as denoted by the head of its NP complement, as in (9b) where both heads select out of a set of people; see (16). This rare occurrence is due to the deficient lexical number forms of the two nouns, which allows them to co-occur as long as there is a restriction of the set represented by the QE by the set denoted by its NP complement that can be achieved either via modification or topicalization. Since both heads in (16) denote the same set, the referent of the QE is recoverable from the referent of the head of the NP.

### **5 Conclusion**

In this paper, we have discussed two types of number agreement mismatch in Russian numeral phrases. We have proposed a unified syntactic account for both phenomena that assumes the NCC in (16) and (30) where the plural NP is a complement to an optionally null QE. We have argued that the structure is forced by a plurality requirement placed on the head of the NP, and either by the selectional requirement of a lower numeral, as in (27), or by (8), as in (9) and (10). We have maintained that the optionally covert status of the QE results from its limited semantic function and its semantic recoverability. The latter obtains in two cases, the most common of which involves contrastive topicalization and partitive construal that results in the salience of the set represented by the QE. The other case is restricted to nouns that lack one of the lexical number forms, in which case the referent of the QE is identical to the referent of the head of its NP complement, allowing for its semantic recoverability.

### **Abbreviations**


### **Acknowledgements**

Many thanks to the audience of FDSL 12 and the anonymous reviewers for useful comments on the material presented here. I would also like to thank my language consultants for grammaticality judgments and the editors for their invaluable work.

### **References**




### **Chapter 19**

## **Russian case inflection: Processing costs and benefits**

### Maria D. Vasilyeva

Lomonosov Moscow State University

Mechanisms underlying the processing and storage of morphological case are still debatable in psycholinguistics. The key questions concern the nature of the special status of the nominative, the homogeneity/heterogeneity of oblique case forms, the impact of case syncretism and paradigmatic relations on nominal processing and the organization of the mental lexicon. We investigate these issues turning to Russian nominal processing. We performed two experiments with feminine and masculine nouns in different cases (experiment 1: nouns in singular, experiment 2: nouns in plural) using the visual lexical decision task. In this task, we measure the speed and accuracy with which the participant classifies sequences of letters as words or non-words. Evidence from both experiments indicates that differences in processing exist not only between the nominative and the other case forms, but also among the obliques. Experiment 1 points to the influence of wordform and exponent ambiguity, while experiment 2 reveals effects that are specific for case per se. We discuss the role of zero vs. overt phonological form, grammatical features, (non-)accidental homonymy, context, frequency, inflectional and relative entropy in case recognition.

**Keywords:** Russian, case, processing, experiment, lexical decision task

### **1 Introduction**

The role of frequency and regularity in the processing of inflectional morphology has long been of utmost concern for psycholinguists. Meanwhile, it is still not clear whether the grammatical features that an inflectional marker conveys play an additional role in wordform processing. For instance, in the case of nouns, a natural question to ask is how case influences nominal recognition.

Maria D. Vasilyeva. 2018. Russian case inflection: Processing costs and benefits. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 427–453. Berlin: Language Science Press. DOI:10.5281/zenodo.2545543


Studies of isolated wordform processing suggest that nominative wordforms are processed faster than other case forms (see, e.g. Lukatela et al. 1978 for Serbian; Niemi et al. 1994 for Finnish; Abulizi et al. 2016 for Uyghur; Gor et al. 2017 for Russian). Yet, there is no uniform explanation of this fact. Likewise, it is debatable whether oblique cases entail equal processing costs or not.

Studies of Finnish and Uyghur report only the pooled mean for all the inflectional variants and compare it to the nominative; they do not inspect contrasts between oblique forms, although they usually use more than one oblique case in their experiments (Niemi et al. 1991; 1994; Hyönä et al. 1995; Laine & Koivisto 1998; Laine et al. 1999; Abulizi et al. 2016). As the nominative has zero inflection in these languages, the oblique processing cost is attributed to a morphological decomposition procedure that is obligatory for inflected obliques, but absent in the non-inflected nominative.

This explanation is unsatisfactory for several reasons. Firstly, the nominative advantage disappears when case forms are embedded in context (Bertram et al. 2000; Hyönä et al. 2002). Bertram et al. (2000) and Hyönä et al. (2002) suggest that the oblique processing disadvantage in a context-less environment arises not due to the decomposition cost, but precisely due to the lack of an appropriate context. Yet, they do not examine whether all oblique cases suffer from the lack of context or benefit from its presence to the same extent. Secondly, processing of zero inflection receives a benefit in recognition speed only if the zero is associated with the nominative, but not with an oblique case (Gor et al. 2017). Finally, phonologically zero and overt nominatives appear not to differ in processing speed (see, e.g., Lukatela et al. 1980 for Serbian; Gor et al. 2017 for Russian). Thus, it is not the zero inflection that makes Finnish and Uyghur nominative wordforms special, but the nominative case itself.

Early Serbian studies did compare processing of oblique cases, but mainly failed to find significant differences in response latencies (Lukatela et al. 1978; 1980; 1987; Katz et al. 1987; Kostić & Katz 1987; Feldman & Fowler 1987). These results, starting with Lukatela et al. (1980), were analyzed within the satellite model. The nominative form represents the nucleus of the nominal paradigm, while oblique case forms surround it as satellites. Satellites are assumed to be equidistant from the nucleus (Feldman & Fowler 1987). Deviations from the predictions of this model were attributed to specific experimental settings in the case of nouns (Feldman & Fowler 1987; Todorović 1988); differences in adjectival case processing were assumed to rely on different mechanisms (Kostić & Katz 1987). However, not more than three case forms belonging to one number were compared at once. It is likely that some effects that could show up in a more elaborate design were obscured. Moreover, ambiguity of case forms that is present in Serbian declension did not receive enough attention.

According to subsequent Serbian studies, processing speed of a wordform correlates positively with the number of syntactic functions/meanings that its inflectional ending encompasses (Kostić 1991; 1995; Kostić et al. 2003; Filipović Đurđević & Kostić 2003; Ševa & Kostić 2003), which hints at oblique processing differences. The proposed methodology of calculating syntactic functions/meanings is not flawless, since the authors bring under this umbrella term both syntactic notions such as subject or complement and semantic notions such as instrument or goal. Furthermore, it is not taken into account that lexemes in the same case have different probabilities of expressing the same thematic role, e.g. 'girl-ins' is less likely to be an instrument than 'hammer-ins'. Likewise, as all homonymous forms are treated equally, the difference between accidental and non-accidental ambiguity is disregarded.

Later on, this problematic measure was abandoned, and the focus shifted to paradigmatic relations between wordforms captured by inflectional and relative entropy measures (see, e.g., Milin et al. 2009). The inflectional entropy *H*(*P*) reflects the amount of information associated with the inflectional paradigm of the target lexeme (see (1), where *f* stands for frequency and the wordform *w<sub>i</sub>* belongs to the paradigm *P* of a lexeme *w*) and correlates negatively with response latencies: when a lexeme has a higher value of the inflectional entropy, its wordforms are processed faster, and vice versa (Moscoso del Prado Martín et al. 2004). The relative entropy *D*(*IP* || *IC*) captures the divergence between the frequency distribution of the target lexeme *w* and the frequency distribution of its inflectional class *IC* (see (2), where *e<sub>i</sub>* stands for the inflectional exponent), and it correlates positively with response latencies: wordforms belonging to paradigms with higher values of relative entropy are processed more slowly (Milin et al. 2009). When surface and lemma frequency combined with entropy measures are taken into account, case differences appear to play no additional role (Milin et al. 2009); yet, this claim was made on a small subset of wordforms: those in -*u* 'acc.sg' and -*e* 'gen.sg'/'nom/acc.pl' for feminine nouns, and those in -*om* 'ins.sg' and -*u* 'dat/loc.sg' for masculine nouns.

(1)

$$H(P) = -\sum_{w_i \in P} \frac{f(w_i)}{f(w)} \log_2 \frac{f(w_i)}{f(w)}$$

(2)

$$D(IP\,\|\,IC) = \sum_{w_i \in P} \frac{f(w_i)}{f(w)} \log_2 \frac{f(w_i)/f(w)}{f(e_i)/f(e)}$$
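The two measures in (1) and (2) can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the text: the lexeme frequency *f*(*w*) is taken to be the sum of its wordform frequencies, *f*(*e*) is taken to be the sum of the exponent frequencies of the inflectional class, and the function names and the toy counts are hypothetical.

```python
import math

def inflectional_entropy(form_freqs):
    """H(P) as in (1). form_freqs maps each paradigm cell to f(w_i)."""
    f_w = sum(form_freqs.values())  # assumption: f(w) is the sum of all f(w_i)
    return -sum(
        (f / f_w) * math.log2(f / f_w)
        for f in form_freqs.values()
        if f > 0  # cells with zero frequency contribute nothing
    )

def relative_entropy(form_freqs, exponent_freqs):
    """D(IP||IC) as in (2). Both dicts are keyed by the same paradigm cells."""
    f_w = sum(form_freqs.values())
    f_e = sum(exponent_freqs.values())  # assumption: f(e) is the sum of all f(e_i)
    return sum(
        (f / f_w) * math.log2((f / f_w) / (exponent_freqs[cell] / f_e))
        for cell, f in form_freqs.items()
        if f > 0
    )

# Invented toy counts, for illustration only (not corpus data).
toy_forms = {"nom": 50, "acc": 25, "gen": 25}
toy_exponents = {"nom": 25, "acc": 25, "gen": 50}

print(inflectional_entropy(toy_forms))
print(relative_entropy(toy_forms, toy_exponents))
```

A lexeme whose wordform distribution matches that of its inflectional class has a relative entropy of zero; the more the two distributions diverge, the larger the value.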

Another viewpoint predicting differences in oblique case processing and paying attention to wordform ambiguity was developed primarily by Clahsen et al. (2001). They adopted minimalist principles in morphology (see, e.g., Wunderlich 1996), suggesting that in the mental lexicon, the meaning of a case exponent is represented as a set of binary features. Non-accidental ambiguous inflectional markers receive underspecified representations. Along with this "natural" underspecification, radical underspecification is assumed to be present as well: only positive values are stored in the mental lexicon, while negative ones are deduced from paradigmatic oppositions. Hence, a direct implication for the psycholinguistic models of wordform processing arises. The number of specified (positive) features should determine the processing ease: the more information a form carries, the longer it takes to be recognized.

The main evidence supporting this claim comes from studies on German adjectival declension. Adjectival case forms with more specified representations are recognized more slowly in the lexical decision task (Clahsen et al. 2001). Such case forms show reduced priming effects under cross-modal priming if the adjective serving as a prime does not share all the positive features with the target (Clahsen et al. 2001). Similar priming effects are to a certain extent replicable even with highly proficient L2-German speakers (Bosch & Clahsen 2016; Bosch et al. 2017). As far as sentence processing is concerned, when an ungrammatical sentence contains an adjective or a determiner that is compatible with the context by its feature specification, this does not lead to the ungrammaticality effect in a sentence matching task that is observed for ungrammatical sentences where specificity is violated, i.e. when the feature set of the wordform mismatches context requirements (Penke et al. 2004). These two types of ungrammatical sentences result in distinct ERP responses (Opitz et al. 2013).

If the radical underspecification hypothesis is true, the same principles should hold for nominal case inflection in other languages as well. Yet, prior studies on case processing cast doubt on its tenability, and additional evidence is needed.

### **2 Present study**

The present study aims to verify whether case form processing is determined by grammatical features, the nominative vs. oblique dichotomy, or context. We addressed this issue in two lexical decision task experiments employing Russian data: experiment 1 with singular nouns and experiment 2 with plural nouns.

Russian was chosen deliberately, because of its particular pattern of case syncretism (the convergence of inflectional exponents in different paradigmatic cells). Russian has six major cases and several inflectional classes of nouns. We will restrict ourselves to inanimate nouns and discuss only the two most productive inflectional classes (Wiese 2004): feminine nouns with the nominative ending -*a* and masculine nouns with the nominative ending -∅. As is evident from Table 1, the two classes of nouns choose uniform endings in the plural (except for the genitive), but behave differently in the singular.

Table 1: Russian case endings for the two most productive inflectional classes


### **2.1 Context**

Presenting case forms in isolation, we can test whether all oblique case forms rely equally on the context. Russian data is particularly suitable for resolving this issue, as there is a special case in Russian, namely locative (also called prepositional), which, unlike other cases, is always governed by a preposition. If the context is crucial for efficient oblique case recognition, locative wordforms should be processed more slowly than other oblique cases, since the latter do not need any preceding context on the left (e.g., if they occur at the beginning of a sentence). This hypothesis is partly supported by Vasilyeva et al.'s (2014) finding: masculine locative singular wordforms are processed as slowly as pseudowords with the same syllabic structure, and they are often classified as nonwords. However, in the singular form, this processing cost could be caused by the homonymy of -*e* 'loc.masc' with -*e* 'dat/loc.fem'. If the effect is induced by the lack of prior preposition activation, locative plural processing should also be impaired. If locative plural processing is not more difficult than processing of other obliques, the difficulty of masculine locative singular cannot be explained by the absence of an appropriate preposition alone and, in general, the context-based hypothesis is not tenable.

### **2.2 Plural: case features vs. exponent frequency**

As oblique plural exponents are non-ambiguous, these data are fruitful for exploring the role of case in nominal processing. If surface and lemma frequency are accounted for and differences in oblique case processing still arise, they could be attributed either to exponent frequencies or to the set of grammatical features associated with the particular case. Frequency counts provided in Samojlova & Slioussar (2014) suggest the hierarchy in (3a). Different approaches to Russian declension employ different sets of features and thus give conflicting predictions; see (3b)–(3e), where we arrange oblique cases according to the number of positive features they express (as suggested by Clahsen et al. 2001).

	- b. Müller 2004: Loc < Dat ≈ Ins < Gen (⟨+obl⟩ < ⟨+obl, +gov⟩ ≈ ⟨+obl, +subj⟩ < ⟨+obl, +gov, +subj⟩)
	- c. Wiese 2004: Loc < Ins ≈ Dat ≈ Gen (⟨+obl⟩ < ⟨+obl, +inst⟩ ≈ ⟨+obl, +dat⟩ ≈ ⟨+obl, +gen⟩)
	- d. Wunderlich 1996: Gen < Dat < Loc < Ins (⟨+hr<sub>N</sub>⟩ < ⟨+hr, +lr⟩ < ⟨+hr & additional semantic features⟩ < ⟨semantic features⟩)
	- e. Caha 2008: Gen < Loc < Dat < Ins
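The feature-counting logic behind (3b) and (3c) can be sketched in a few lines of Python. The feature sets are copied from (3b) and (3c); the function name `predicted_hierarchy` and the representation of features as sets of strings are hypothetical choices made for this illustration, and the sketch only covers the two approaches for which the feature sets are listed explicitly.

```python
from itertools import groupby

# Positive features per oblique case, as listed in (3b) and (3c).
MUELLER_2004 = {
    "Loc": {"obl"},
    "Dat": {"obl", "gov"},
    "Ins": {"obl", "subj"},
    "Gen": {"obl", "gov", "subj"},
}
WIESE_2004 = {
    "Loc": {"obl"},
    "Ins": {"obl", "inst"},
    "Dat": {"obl", "dat"},
    "Gen": {"obl", "gen"},
}

def predicted_hierarchy(features):
    """Group cases by number of positive features, fewest first.

    Cases in the same group are predicted to be equally costly;
    later groups are predicted to be recognized more slowly.
    """
    ranked = sorted(features, key=lambda case: (len(features[case]), case))
    return [
        sorted(group)
        for _, group in groupby(ranked, key=lambda case: len(features[case]))
    ]

print(predicted_hierarchy(MUELLER_2004))
print(predicted_hierarchy(WIESE_2004))
```

Under these assumptions, the two feature systems agree that the locative should be the easiest oblique case but disagree on whether the genitive is costlier than the dative and the instrumental.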

### **2.2.1 Zero oblique inflection**

Gor et al.'s (2017) study demonstrated that oblique overt and zero inflection trigger similar processing costs. However, their conclusion was based on the comparison of feminine -∅ 'gen.pl' to masculine -*a* 'gen.sg'. A comparison with masculine -*ov* 'gen.pl' is needed to support their claim.

### **2.2.2 Nominative ambiguity**

Feminine -*y* 'nom.pl' coincides with 'gen.sg'. According to the approach advocated by Kostić (1991) and subsequent work, such ambiguous wordforms should benefit from their wider syntactic distribution and be recognized faster than their unambiguous masculine counterparts in -*y* 'nom.pl'.


### **2.3 Singular: case syncretism**

Even if we obtain no significant differences in plural oblique processing, differences in singular oblique processing might arise due to ambiguity. Comparing instrumental wordforms, which are non-ambiguous, to other obliques, we can determine how interparadigmatic and intraparadigmatic syncretism influences wordform recognition. Furthermore, comparisons of wordforms with the same exponents, but belonging to different inflectional classes might help to resolve the debates concerning accidental vs. non-accidental homonymy in Russian singular declension.

If the two -*u*-s are accidentally homonymous (Wiese 2004), 'acc.fem' is expected to be processed faster than 'dat.masc'. If the two -*e*-s are accidentally homonymous (Müller 2004), 'dat/loc.fem' is expected to be processed faster than 'loc.masc'.

We also decided to compare -*e* 'dat/loc.fem' to -*u* 'dat.masc'. If the dative reading is dominant for -*e*, there should be no difference between these two conditions. Finally, we compared feminine and masculine instrumental wordforms. These endings are also used in adjectives of the respective gender, but their distribution is different: feminine -*oj* covers all oblique cases, while masculine -*om* is used in locative only. This difference might lead to an advantage of feminine instrumental over masculine instrumental.<sup>1</sup>

### **2.3.1 Zero nominative inflection**

Russian overt and non-overt nominative inflection (-*a* 'nom.fem' and -∅ 'nom/acc.masc') was already compared in Gor et al. (2017), and no difference was observed. However, their study employed an auditory lexical decision task, and it is unclear whether their results are modality-neutral.

### **3 Method**

### **3.1 Participants**

Ninety-six right-handed native speakers of Russian (aged 17–25 years) were tested. Half of them participated in experiment 1, the other half in experiment 2.

<sup>1</sup>Since feminine genitive singular -*y* is homonymous with nominative plural, we did not compare it to the masculine genitive singular.

### **3.2 Stimuli**

We used all six case forms of inanimate nouns belonging to two declensional classes (54 feminine nouns ending in -*a* and 54 masculine nouns ending in -∅, matched for lemma frequency). All stimuli were base nouns; they did not undergo any stem alternations and had fixed stress on the stem (the 1a inflectional class according to Zaliznyak 1977). Length in the nominative ranged from 4 to 6 letters (each group comprised one third of words with each length). 108 nouns with pseudoendings and 108 inflected pseudostems served as nonwords. In experiment 1, nouns were presented in the singular; in experiment 2, in the plural. A Latin-square design was employed, with the number of lists corresponding to the number of case forms.
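The Latin-square assignment described above can be illustrated with a minimal sketch. It assumes a simple rotation scheme (item *i* appears in case *(i + list index) mod 6*); the actual rotation used in the study is not reported, and the item labels and the function name `latin_square_lists` are hypothetical.

```python
from collections import Counter

# The six case forms crossed with items (labels are illustrative).
CASES = ["nom", "gen", "dat", "acc", "ins", "loc"]


def latin_square_lists(items, conditions):
    """Build one list per condition by rotating conditions over items.

    Every list contains each item exactly once, and across lists each
    item appears once in every condition.
    """
    n = len(conditions)
    lists = []
    for list_idx in range(n):
        # Assumed rotation: item i gets condition (i + list_idx) mod n.
        lists.append([(item, conditions[(i + list_idx) % n])
                      for i, item in enumerate(items)])
    return lists


# Hypothetical labels standing in for the 108 real nouns.
items = [f"noun{i:03d}" for i in range(108)]
lists = latin_square_lists(items, CASES)

for lst in lists:
    per_case = Counter(case for _, case in lst)
    # 108 items spread evenly over 6 case forms: 18 items per case.
    assert len(lst) == 108 and set(per_case.values()) == {18}
```

Under this scheme each participant sees every noun once, in a single case form, which is why the number of lists equals the number of case forms.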

### **3.3 Procedure**

Each participant was assigned to one of the six experimental lists and was tested individually. Experiments were run using DMDX software (Forster & Forster 2003). Before the test phase (324 trials), participants received written instructions and performed a practice phase (20 trials). In each trial, participants had to decide whether the string of letters presented on the screen was a real Russian word or not. They were instructed to respond as quickly and accurately as possible. Each trial started with a fixation sign (+) that was displayed on the screen for 600 ms. The stimulus remained on the screen until response or time-out (2500 ms). The interstimulus interval was set to 2500 ms.

### **3.4 Data analysis**

We used linear mixed-effects modeling for the analysis of reaction times and logistic mixed regression for the accuracy data (Baayen 2008). Statistical analysis was implemented in the package lme4 (Bates et al. 2014) in the statistical software R (R Core Team 2014). *T*-values, *z*-values, *p*-values, and standard errors were determined using the package lmerTest (Kuznetsova et al. 2017). Fixed and random effects were included only if they significantly improved the model fit in a backward stepwise model selection procedure. Models were selected using Chi-square log-likelihood ratio tests with regular maximum likelihood parameter estimation.

Subject and lexeme were treated as random effects. Lemma and wordform frequency, length in letters and syllables, mean Levenshtein distance to the nearest 20 lexeme-neighbors, inflectional and relative entropy measures were additionally included as covariates.<sup>2, 3, 4</sup> Trial order (*z* transformation on log numbers) was included to control for longitudinal task effects such as fatigue or habituation. All these covariates were log-transformed. To avoid multicollinearity, all counts except for trial were transformed into 5 principal components, explaining 93.5% of variance (Baayen 2008). The first principal component (PC1) captured orthographic characteristics of the stimulus. The second component (PC2) was inversely related to frequency. The third component (PC3) was inversely related to relative entropy and positively related to inflectional entropy. Paired contrasts were carried out in the package lsmeans (Lenth 2016). For planned comparisons, FDR-adjusted *p*-values are reported (Benjamini & Hochberg 1995).
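For readers unfamiliar with the FDR adjustment mentioned above, the following is a minimal sketch of the Benjamini–Hochberg step-up procedure. It is a plain-Python illustration, not the R routine used in the study; the function name `benjamini_hochberg` and the example *p*-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is multiplied by m / rank (rank in ascending order),
    then monotonicity is enforced from the largest rank downwards.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from four planned comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in adjusted]
```

A contrast is then reported as significant when its adjusted value falls below the chosen threshold, which controls the expected proportion of false discoveries across the family of planned comparisons.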

As our words were presented without context, case labels for ambiguous endings (feminine -*y* and -*e*, masculine -*y* and -∅) are somewhat arbitrary. Hence, we do not expect any differences between feminine locative and dative -*e*, nor between masculine nominative and accusative -∅. However, the distinction is needed for counterbalancing, as patterns of syncretism do not coincide across our two noun groups. In the statistical analysis, the mean pooled over the two "conditions" will be used.

### **4 Results**

Two participants in experiment 1 and two participants in experiment 2 gave fewer than 75% correct answers to word stimuli, so we recruited four additional people to replace them. We excluded from further statistical analysis two lexemes in experiment 1 and one lexeme in experiment 2 due to low mean accuracy score.

Reaction time (RT) data were analyzed as follows. Incorrect responses were removed from the analysis (7.1% of all data in experiment 1, 7.9% in experiment 2). Responses that were too fast (< 300 ms) or too slow (> 1500 ms) were likewise excluded from further analysis. We applied log-transformation to reduce the positive skew.

<sup>2</sup>Lemma frequency was taken from the frequency dictionary (Lyashevskaya & Sharoff 2009). Wordform frequency was manually extracted from the main undisambiguated subcorpus of the Russian National Corpus (http://ruscorpora.ru); for ambiguous endings the cumulative frequency was taken, relying on Milin et al.'s (2009) experience. To avoid zero frequencies, one was added to all counts, as suggested by Brysbaert & Diependaele (2013).

<sup>3</sup>The Levenshtein distance was calculated in the vwr package (Keuleers 2013) in the R software (R Core Team 2014).

<sup>4</sup> In order to calculate relative entropy, frequency of exponents was taken from the database created by Samojlova & Slioussar (2014).

After that, remaining outliers were cut off via interquartile trimming.<sup>5</sup> In sum, 3.9% of correct responses were removed in experiment 1 and 5.7% in experiment 2. Raw RTs and error rates (ER) are presented in Table 2.
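The cleaning steps can be sketched as follows. This is a minimal illustration under stated assumptions: the trimming rule is the one given in footnote 5 (bounds at 2.5 × IQR), the trimming is applied to raw RTs within one group, and the quartiles are computed with Python's `statistics.quantiles` default method, which may differ from the quartile definition used in R. The function names and the sample RTs are hypothetical.

```python
import math
import statistics


def absolute_cutoff(rts, low=300, high=1500):
    """Drop responses faster than `low` ms or slower than `high` ms."""
    return [rt for rt in rts if low <= rt <= high]


def iqr_trim(values, k=2.5):
    """Keep values v with Q1 - k*IQR < v < Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower < v < upper]


# Hypothetical correct-response RTs (ms) for one grouping cell.
rts = [250, 500, 520, 540, 560, 580, 600, 620, 1400, 1600]

kept = iqr_trim(absolute_cutoff(rts))
# Log-transform the retained RTs to reduce positive skew.
log_rts = [math.log(rt) for rt in kept]
```

In the study the trimming was applied by participants, items, gender and case, so a function like `iqr_trim` would be called once per grouping cell rather than on the pooled data.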


Table 2: Mean RT (in ms) and ER (in %) to feminine and masculine nouns in different cases and numbers (*SD* is provided in brackets)

Final models for RTs and accuracy included the following factors: PC2, PC3, case, gender, a case by gender interaction and a PC3 by gender interaction. The model accounting for RTs in experiment 1 also included trial. All other predictors and interactions turned out to be insignificant. Full model specifications are presented in the Appendix (see Table 5 for experiment 1 and Table 6 for experiment 2).

### **4.1 Experiment 1: Singular**

Trial had a facilitative effect on RTs (*B* = −0.012, *t*(4541) = −4.45, *p* < .001). PC2 (inversely related to frequency) had a facilitative effect on RTs and accuracy rate (*B* = 0.018, *t*(121) = 6.03, *p* < .001 and *B* = −0.364, *z* = −5.86, *p* < .001, respectively).

PC3 (entropy measures) affected the two types of nouns differently (*B* = 0.024, *t*(103) = 3.2, *p* = .002 and *B* = −0.472, *z* = −3.628, *p* < .001, respectively): there was a facilitation for feminine nouns (*B* = −0.022, *t*(103) = −3.68, *p* < .001 and *B* = 0.509, *z* = 4.89, *p* < .001, respectively) and no effect on masculine nouns (*B* = 0.002, *t*(103) = 0.48, *p* = .633 and *B* = −0.11, *z* = 0.481, *p* = .631, respectively).

<sup>5</sup>We kept only those RTs that satisfied the following formula: Q1 − (2.5 × IQR) < RT < Q3 + (2.5 × IQR), by participants, items (lexemes), gender and case (Q1 stands for first quartile, Q3 for third quartile, and IQR = Q3 − Q1 for interquartile range).

### **4.1.1 Paired contrasts for different cases**

Paired contrasts are summarized in Table 3 (for statistical details see Table 7). Apart from nominative vs. oblique differences, we observe differences between oblique case forms. Instrumental and the -*u* form ('acc.fem' and 'dat.masc') are "easy" obliques with faster responses and higher accuracy scores. The -*e* forms ('dat/loc.fem' and 'loc.masc') constitute the "difficult" oblique group with slower responses and lower accuracy scores. Feminine genitive falls into the "easy" group, while masculine genitive patterns with the difficult -*e* 'loc' form.<sup>6</sup>

Table 3: Experiment 1 (singular nouns): summary of paired contrasts analysis for feminine and masculine singular nouns in different cases


In the analysis of accuracy, in contrast to the RT data, we fail to observe the nominative superiority over instrumental in either noun group. What is more, according to the accuracy analysis, masculine genitive yields a higher accuracy rate than masculine locative.

### **4.1.2 Paired contrasts for gender**

For statistical details, see Table 7. There was no general gender effect either in the RT or accuracy analysis. Feminine -*e* forms ('dat/loc.fem') are recognized faster and more accurately than masculine -*e* forms ('loc.masc'). Feminine instrumental wordforms are recognized faster than masculine ones, but there is no effect in the accuracy analysis. Feminine nominative is responded to more slowly than masculine -∅ forms ('nom/acc.masc'); no effect shows up in the accuracy analysis. Feminine -*e* forms ('dat/loc.fem') are recognized less accurately than masculine -*u* forms ('dat.masc'), but there is no significant difference in the RT analysis. There is no significant difference between feminine and masculine -*u* forms ('acc.fem' and 'dat.masc', respectively).

<sup>6</sup> In Table 3 and further, "<" stands for significantly faster or significantly more accurate responses, and "≈" stands for no significant difference.

### **4.2 Experiment 2: Plural**

PC2 (inversely related to frequency) had a facilitative effect on RTs and accuracy rates (*B* = 0.019, *t*(126) = 4.98, *p* < .001 and *B* = −0.285, *z* = −4.21, *p* < .001, respectively).

PC3 (entropy measures) affected the two types of nouns differently (*B* = 0.021, *t*(107) = 2.28, *p* = .025 and *B* = −0.423, *z* = −2.727, *p* = .006, respectively): there was a facilitation for feminine nouns (*B* = −0.022, *t*(111) = −2.91, *p* = .004 and *B* = .442, *z* = 3.7, *p* < .001, respectively), but no effect on masculine nouns (*B* = −0.0004, *t*(101) = −0.06, *p* = .949 and *B* = 0.019, *z* = 0.19, *p* = .847, respectively).

### **4.2.1 Paired contrasts for case**

Paired contrasts for case are summarized in Table 4 (for statistical details see Table 8). According to the RT analysis, we observe a tripartite division of oblique forms: locative as the easiest, dative in the middle and instrumental as the most difficult. Genitive is recognized significantly faster than instrumental, but differs neither from locative, nor from dative.

In the accuracy analysis, only two contrasts are retained: between the -*y* form ('nom/acc') and instrumental and the difference between the -*y* form ('nom/acc') and genitive.

Table 4: Experiment 2 (plural nouns): summary of paired contrasts analysis for plural nouns in different cases


### **4.2.2 Paired contrasts for gender**

For statistical details, see Table 7. There was no difference between the two groups of nouns either in the RT or accuracy analysis. In the genitive, feminine nouns have higher odds of being recognized incorrectly than masculine nouns; no significant difference shows up in the RT analysis. Feminine and masculine -*y* forms ('nom/acc') differ neither in the RT analysis nor in the accuracy analysis.

### **5 Discussion**

Results of our two experiments replicate the nominative/oblique dichotomy effect, previously reported for Russian and other languages (see §1). Apart from this trivial finding, we obtained several significant differences between oblique case processing both in singular and in plural. As we took into account lemma and surface frequency of a wordform, such oblique case processing differences should stem from the properties of the inflectional exponents.

### **5.1 Inflectional and relative entropy**

We considered inflectional and relative entropy among potential covariates in the statistical analysis, as these factors were assumed to be highly predictive of nominal processing in Serbian (Milin et al. 2009). Prior to the analysis, we transformed our counts into principal components. PC3, which captures these two measures, emerged in the statistical analysis of RTs and accuracy in both experiments. Unfortunately, the influence of PC3 was attested for feminine nouns only. The effect lies in the same direction as reported by Milin, Filipović Đurđević & Moscoso del Prado Martín (2009), but in the Serbian study masculine and feminine nouns were equally sensitive to entropy measures. However, to calculate the entropy values, they used frequencies of feminine exponents, as this inflectional class is assumed to be dominant in Serbian. In Russian, masculine -∅ nouns are slightly more frequent than feminine -*a* nouns (Samojlova & Slioussar 2014) and, thus, might be considered dominant. However, as the patterns of syncretism in these two noun groups do not coincide, we decided against using dominant class frequencies and employed feminine frequencies for feminine nouns and masculine frequencies for masculine nouns. This decision might be a reason for the observed discrepancies with Milin et al. (2009), but a more refined study is needed in order to draw firmer conclusions.
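To make the two entropy measures concrete, here is a minimal sketch. It assumes the standard information-theoretic definitions (Shannon entropy over a lexeme's wordform frequencies, and Kullback–Leibler divergence of that distribution from the exponent distribution of the inflectional class), which is our reading of the measures in Milin et al. (2009); the chapter itself does not spell out the formulas. All frequency counts and function names below are hypothetical.

```python
import math


def inflectional_entropy(form_freqs):
    """Shannon entropy (bits) of a lexeme's paradigm distribution."""
    total = sum(form_freqs)
    return -sum((f / total) * math.log2(f / total)
                for f in form_freqs if f > 0)


def relative_entropy(form_freqs, exponent_freqs):
    """KL divergence (bits) of the lexeme's paradigm distribution
    from the exponent distribution of its inflectional class."""
    p_total = sum(form_freqs)
    q_total = sum(exponent_freqs)
    return sum((f / p_total) * math.log2((f / p_total) / (g / q_total))
               for f, g in zip(form_freqs, exponent_freqs) if f > 0)


# Hypothetical counts (after adding one) for the forms of one noun
# and for the matching exponents of its inflectional class.
noun_forms = [120, 60, 15, 45, 30, 30]
class_exponents = [4000, 2500, 500, 1500, 800, 700]

h = inflectional_entropy(noun_forms)
d = relative_entropy(noun_forms, class_exponents)
```

Under these definitions, using feminine exponent frequencies for feminine nouns and masculine ones for masculine nouns amounts to swapping `class_exponents` per noun group, which is the choice discussed in the paragraph above.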

### **5.2 Context: the locative issue**

Initially, we hypothesized that if absence of context is an important source of oblique processing cost, preposition-less locatives should suffer the most, both in singular and in plural. Although masculine locative singular was one of the most difficult forms to recognize, plural locative was processed faster than all other obliques. Thus, we conclude that context-based explanations do not receive support at least for wordforms with non-ambiguous case markers. Influence of context on forms with ambiguous case exponents will be discussed below.

### **5.3 Plural**

The hierarchy of plural case processing speed (4a) does not follow the order of exponent frequency: otherwise, instrumental would have been the easiest to process. So we can conclude that exponent frequency does not play a major role in the case form recognition. Nor does this hierarchy agree with the predictions derived from the frequency of exponents and feature sets proposed by Müller (2004); Wiese (2004); Wunderlich (2004). Interestingly, it roughly resembles Caha's (2008) nanosyntactic approach to Russian case, see (4b).

(4) a. -*y* < Loc < Dat < Ins, Loc ≈ Gen ≈ Dat, Gen < Ins (our data: exp. 2)

	- b. [Ins [Dat [Loc [Gen [Acc [Nom]]]]]]

Here, the only diverging case is genitive. Unlike all other cases in plural, it is spelled out differently for our two target inflectional classes. Hence, at the checking or licensing stage (see, e.g., Bertram et al. 2000), which follows the decomposition of the wordform into morphemes, it is verified whether the inflectional class of the lexeme matches the inflectional class of the ending. For other oblique case forms, such a procedure is not needed, as they are uniform for both classes. As a consequence, we observe longer reaction times than those that could be expected if genitive plural meaning was expressed in only one way.

### **5.3.1 Zero oblique inflection**

In line with Gor et al. (2017), response latencies for the zero oblique -∅ 'gen.pl' did not differ significantly from the overt oblique -*ov* 'gen.pl'. Yet, the zero genitive yielded higher error rates than the overt genitive. We doubt that low accuracy stems from a greater processing cost associated with zero inflection compared to overt inflection, especially as this is not attested in the RT analysis. A more plausible source for the high error rate is homonymy. Feminine genitive plural wordforms having a phonologically null ending are ambiguous with the stem itself. This homonymy might lead to a competition in the recognition process: if the wordform reading wins, the correct answer is produced in the lexical decision task; if the stem reading wins, a non-word answer is selected, as Russian does not allow for bare stems.

### **5.3.2 Nominative ambiguity**

We failed to find evidence supporting the claim that the ambiguous feminine -*y* 'nom.pl'/'gen.sg' is easier to recognize than the non-ambiguous masculine -*y* 'nom.pl' due to its wider distribution.

### **5.4 Singular**

In singular, the following generalization holds for both noun groups:

(5) Nom < Ins ≈ -*u* < -*e*, where -*u* corresponds to 'acc.fem'/'dat.masc' and -*e* corresponds to 'dat/loc.fem'/'loc.masc'

### **5.4.1 Instrumental**

Instrumental singular wordforms, despite their relatively low frequency (Samojlova & Slioussar 2014), are among the easiest obliques to recognize due to their unambiguity. In §2, we hypothesized that feminine instrumental -*oj* could be processed faster than masculine instrumental -*om* due to their homonymy with adjectival endings, and this prediction was borne out. Masculine -*om* marks different cases in nouns and adjectives ('ins' vs. 'loc', respectively), and this feature mismatch might negatively affect its processing. Feminine adjectival -*oj*, on the other hand, includes 'ins' as one of its possible interpretations; consequently, no conflict arises.

### **5.4.2 -U forms**

The -*u* forms ('acc.fem' and 'dat.masc') behave similarly to the unambiguous instrumentals, but this does not signify that their homonymy is accidental. The lack of significant difference in the processing speed of the two forms is compatible with the hypothesis of a shared underspecified representation, as suggested in Müller (2004); Wunderlich (2004). However, this evidence is not enough to reject the accidental homonymy hypothesis. A better insight into this problem might be gained if we compare the processing of dative -*u* in the accusative environment and vice versa. If there is one shared representation for the two -*u*-s in the mental lexicon, such sentences, following Penke et al. (2004); Opitz et al. (2013), should show reduced ungrammaticality effects, if any.

### **5.4.3 -E forms**

The -*e* forms ('dat/loc.fem' vs. 'loc.masc') are the most difficult to process in both noun groups, triggering longer RTs and lower accuracy, with masculine -*e* being even more difficult, showing the slowest reaction times and the highest error rates. Feminine -*e* is largely believed to have a shared semantic representation for its two interpretations (Müller 2004; Wiese 2004; Wunderlich 2004). But a shared representation on its own is not a plausible source for such a processing cost. Masculine -*e* wordforms, on the contrary, are not ambiguous, but they are always governed by a preposition, and in the present study locatives were presented without a preposition in the experimental conditions. In experiment 2, locative plural, which is also preposition-dependent, actually turned out to be one of the easiest oblique cases. Thus, absence of the preposition is not the main reason for participants' poor performance on singular masculine locatives.

We suggest that this finding could be accounted for in a model of Russian case where all -*e*-s have one shared representation. The features distinguishing between two cases compete with each other during wordform processing. Locative, as the more frequent reading (Samojlova & Slioussar 2014), has by default more weight, while dative gets more weight in the appropriate context, i.e. in the preposition-less environment. This competition slows down the recognition process. If we assume that the context cue prevails over the frequency cue, then for feminine nouns the dative reading succeeds. With masculine nouns, the context cue will lead to the incorrect selection of the dative features and cause non-word answers. The reanalysis of -*e* as 'loc' is, thus, warranted. Like any reanalysis, it incurs an additional time cost, which explains the superiority of feminine -*e* forms over masculine -*e* forms in processing speed.

### **5.4.4 Genitive**

Genitive wordforms behave differently in the two inflectional classes. Masculine genitive patterns together with the difficult -*e* in the RT analysis. Feminine genitive falls in the "easier" oblique group. Both genitive endings are homonymous: feminine genitive -*y* coincides with nominative plural, masculine genitive -*a* with the feminine nominative -*a*.

As far as the masculine genitive -*a* is concerned, an analysis similar to the analysis of -*e* forms is plausible. The two -*a*-s ('gen.masc' and 'nom.fem') have a shared representation in the mental lexicon. In a context-less condition, the nominative reading is preferred. A shared representation for these morphemes was previously proposed by Müller (2004); Wunderlich (2004). However, this analysis does not capture the fact that masculine genitive is processed more accurately than masculine locative.

As for feminine genitive in -*y*, the unanimous position (Müller 2004; Wunderlich 2004; Wiese 2004) is that the homonymy is accidental. Genitive singular -*y* is more frequent than nominative plural (Samojlova & Slioussar 2014). Thus, fullform storage is more likely for nominative plural, following the suggestion by Bertram et al. (2000). Fullform access is assumed to be faster than the decomposition route (e.g., Bertram et al. 2000), yet we do not have enough evidence to claim that our -*y* forms were always processed as nominative plurals. In the singular environment of experiment 1, the singular reading might be chosen due to interstimulus priming. Nevertheless, whichever interpretation is chosen, it is easier to process than the ambiguous -*e*.

### **5.4.5 Zero nominative inflection**

The visual lexical decision task hints at a processing advantage for the phonologically non-overt inflection (-∅ 'nom/acc.masc') over the phonologically overt inflection (-*a* 'nom.fem'). This contrasts with the null effect obtained previously in the auditory modality (Gor et al. 2017); note, however, that non-significant effects are hard to interpret, as they do not license any conclusion. Strictly speaking, these two forms differ not only in phonological overtness, but also in ambiguity: the -*a* wordform is unambiguous, while the -∅ wordform also marks accusative in the discussed set of nouns. So this finding should be treated with caution.

### **6 Conclusion**

The results of our two experiments disagree with previous findings in Finnish, Uyghur, and Serbian: they suggest that differences in oblique case processing exist. Moreover, these differences arise both in transparent systems of case marking (Russian plural) and opaque or highly syncretic systems of case marking (Russian singular).

Data from the experiment with plural nouns suggest that case processing might be guided by Caha's (2008) functional case sequence. Results for singular nouns imply that different types of ambiguity are present in Russian declension. When the ambiguity is not accidental, context plays a major role in the selection of the interpretation.

### **Abbreviations**


### **Acknowledgements**

The author would like to thank Elena Gorbunova, Oleg Volkov and Nikita Loginov for their assistance with the participant recruitment as well as Olga Fedorova, Maria Falikman and two anonymous reviewers for their comments on the earlier version of this paper.

### **Appendix A: Experimental items**

Lemma frequency counts are given in brackets.

### **Feminine nouns** (47.39)

*anketa* 'questionnaire' (14.4), *arfa* 'harp' (2.6), *astra* 'aster' (3.8), *aura* 'aura' (5.1), *beseda* 'conversation' (87.5), *bukva* 'letter, character' (63.5), *data* 'date' (49.5), *doza* 'dose' (22.4), *dyuna* 'dune' (2), *fleita* 'flute' (5.8), *gazeta* 'newspaper' (237.5), *gitara* 'guitar' (22.2), *kareta* 'carriage' (9.4), *karta* 'map' (103), *kassa* 'cashier's desk' (20.9), *klumba* 'flower-bed' (8.7), *klyaksa* 'blot' (4.5), *kofta* 'jacket' (7.7), *lampa* 'lamp' (34), *lapa* 'paw' (39.7), *lenta* 'ribbon' (35.9), *lira* 'lyre' (8), *lyustra* 'lustre' (9.9), *mera* 'measure' (284.3), *minuta* 'minute' (344.2), *moneta* 'coin' (17.5), *norma* 'norm' (111.3), *orbita* 'orbit' (15), *pal'ma* 'palm tree' (14.3), *pasta* 'paste' (6.3), *pochva* 'soil' (56.2), *poza* 'pose' (29.8), *raketa* 'rocket' (62.9), *rama* 'frame' (21.2), *rana* 'wound' (29.4), *rasa* 'race' (5.9), *rifma* 'rhyme' (8.5), *roza* 'rose' (42.7), *shakhta* 'pit' (20.7), *shina* 'tire' (15.3), *shirma* 'folding-screen' (5.3), *shkola* 'school' (316), *shlyapa* 'hat' (34.2), *shuba* 'fur coat' (18.7), *shvabra* 'mop' (3.4), *summa* 'sum' (130.6), *trassa* 'route' (32.5), *travma* 'trauma' (19.6), *tsifra* 'numeric' (62.2), *tsitata* 'citation' (21.5), *tykva* 'pumpkin' (5), *vaza* 'vase' (14.3), *yakhta* 'yacht' (9.5), *yurta* 'yurt' (2.7)

### **Masculine nouns** (47.37)

*al'bom* 'album' (23.7), *ananas* 'pineapple' (3.6), *aromat* 'aroma' (22.9), *aspekt* 'aspect' (35.6), *atom* 'atom' (20.5), *banan* 'banana' (7.3), *baton* 'loaf (of bread)' (5.3), *bufet* 'buffet' (20), *buton* 'bud' (4.6), *desert* 'dessert' (4), *diplom* 'diploma' (25.8), *divan* 'sofa' (60.1), *dzhip* 'jeep' (14.7), *fontan* 'fountain' (18.4), *frukt* 'fruit' (21.6), *gimn* 'hymn' (14.8), *ideal* 'ideal' (36), *kanat* 'rope' (9), *kapriz* 'caprice' (7.1), *kedr* 'cedar' (6.1), *khalat* 'bathrobe' (36.1), *klad* 'treasure' (7.5), *komod* 'dresser' (5.2), *kontur* 'contour' (15.3), *kostyum* 'costume, suit' (81.3), *kurort* 'resort' (12.8), *metall* 'metal' (57.5), *moment* 'moment' (306.8), *nrav* 'temper' (17.8), *ofis* 'office' (34.1), *period* 'period' (204.2), *plan* 'plan' (235.3), *pled* 'plaid' (4.8), *reis* 'flight, voyage' (22), *remont* 'reparation' (64.2), *ritm* 'rhythm' (30.6), *romb* 'rhombus' (1.8), *rulon* 'roll' (4.3), *servis* 'service' (14.6), *sezon* 'season' (69.2), *shram* 'scar' (10.7), *shtraf* 'forfeit' (32.3), *simvol* 'symbol' (46.4), *sous* 'sauce' (10.8), *syuzhet* 'storyline' (56.6), *teatr* 'theater' (305.3), *tekst* 'text' (146.2), *temp* 'tempo' (49), *tovar* 'item of goods' (115.5), *tsikl* 'cycle' (43.6), *virus* 'virus' (106.5), *vulkan* 'volcano' (6), *yarus* 'tier, layer' (6.5), *zhanr* 'genre' (36)

### **Appendix B: Results of statistical analyses**


Table 6: Experiment 2 (plural nouns): final models for RTs and accuracy


Table 7: Experiment 1 (singular nouns): paired contrasts for feminine and masculine nouns in different cases (analyses with *p* ≤ .05 are given in bold)

Table 8: Experiment 2 (plural nouns): paired contrasts for feminine and masculine nouns in different cases (analyses with *p* ≤ .05 are given in bold)


### **References**



### **Chapter 20**

## **A puzzle about adverbials in simultaneous readings of present and past-under-past in Russian**

### Ekaterina Vostrikova

University of Massachusetts, Amherst

The present and past tense can both get the simultaneous interpretation in complement clauses when they are embedded under the past tense in Russian. However, I observe that the adverbials that are allowed with present tense in such contexts (for example, *sejčas* 'now') are not allowed with the past tense and vice versa (for example, *togda* 'then' is not allowed with the present). I show that simply restricting the meaning of those adverbials does not help due to the fact that tenses can be interpreted *de re*. In *de re* construals, tenses are interpreted outside of the clause they originate in, so no meaning conflict between the tense and the adverbial in the embedded clause is predicted. I propose that when a tense is interpreted *de re*, an adverbial has to be interpreted *de re* together with it. I show that under this assumption the observed restriction follows in a direct way.

**Keywords:** present tense, past tense, past-under-past, present-under-past, de re, attitude reports, temporal adverbials

### **1 Introduction**

### **1.1 Simultaneous readings of present-under-past and past-under-past in Russian**

In this paper I will discuss simultaneous readings that the present tense and the past tense can receive in complement clauses embedded under the past tense in Russian. I will point out that there are some restrictions on adverbials that can occur in such clauses and I will attempt to explain those restrictions.

Ekaterina Vostrikova. 2018. A puzzle about adverbials in simultaneous readings of present and past-under-past in Russian. In Denisa Lenertová, Roland Meyer, Radek Šimík & Luka Szucsich (eds.), *Advances in formal Slavic linguistics 2016*, 455–477. Berlin: Language Science Press. DOI:10.5281/zenodo.2545545

In Russian, the simultaneous reading of past tense in a complement clause embedded under past tense in a main clause is in principle available (Altshuler 2008). Usually, the simultaneous reading is not the most salient one. For example, the most salient reading of the sentence in (1) is the back-shifted reading: according to Tanja, Putin was president at some time in the past with respect to 2016, the time when she pronounced the sentence.<sup>1</sup>

(1) V 2016 godu Tanja skazala, čto Putin byl prezidentom Rossii.
    in 2016 year Tanja say.past that Putin be.past president.inst Russia.gen
    'In 2016 Tanja said that Putin was the president of Russia.'

The simultaneous reading of past-under-past in (1) can be enforced by some adverbials, such as *togda* 'then'.<sup>2</sup> In (2) *togda* anaphorically refers to 2016 and the interpretation where the event of saying and the state of being the president of Russia overlap in time becomes the most salient.

(2) V 2016 godu Tanja skazala, čto togda Putin byl prezidentom Rossii.
    in 2016 year Tanja say.past that then Putin be.past president.inst Russia.gen
    'In 2016 Tanja said that Putin was the president of Russia then.'

Like Hebrew (Ogihara & Sharvit 2012), Russian also has a relative present tense. For example, (3), where the verb in the embedded clause has present tense features and the verb in the main clause has past tense features, expresses the idea that Tanja said that Putin was president at the time when she pronounced the words 'Putin is the president'.<sup>3</sup>

(3) V 2016 godu Tanja skazala, čto Putin ∅ prezident Rossii.
    in 2016 year Tanja say.past that Putin be.pres president.nom Russia.gen
    'In 2016 Tanja said that Putin was the president of Russia.'

### **1.2 The adverbial puzzle**

Past-under-past and present-under-past in Russian both seem to be able to express the simultaneity of the time of the eventuality described by a complement clause and the time when the embedded claim was made. However, even though *togda* really enforces the simultaneous reading of past-under-past, as we saw in (2), it is completely unacceptable in a complement clause with the present tense embedded under the past tense; a relevant example is given in (4).<sup>4</sup>

<sup>1</sup>The Russian judgments reported in this paper are my own, confirmed with other native speakers of Russian.

<sup>2</sup> In this respect, Russian behaves like Hebrew, as reported in Ogihara & Sharvit (2012).

<sup>3</sup>Note that the present tense copula (indicated by ∅ and glossed as be.pres) is silent in Russian.

(4) # V 2016 godu Tanja skazala, čto Putin togda ∅ prezident Rossii.
in 2016 year Tanja say.past that Putin then be.pres president.nom Russia.gen
Intended: 'In 2016 Tanja said that Putin was the president of Russia then.'

If past-under-past and present-under-past can get the same interpretation, why is *togda* 'then' possible in the embedded clause in the first case, but not in the second?

On the other hand, there are some adverbials, such as *sejčas* 'now', that are compatible with present-under-past in Russian (5), but not with past-under-past (6). Thus, the presence of *sejčas* in (6) makes it ill-formed, whereas without *sejčas* both sentences (5) and (6) can have the simultaneous reading.<sup>5</sup>


This is the adverbial puzzle that I will address in this paper. The fact that not all tenses are compatible with all adverbials has been previously noticed in the literature (for example, see the discussion in Hornstein 1990). What is special about the

<sup>4</sup>The symbol # is used when a sentence or expression is ill-formed due to its meaning.

<sup>5</sup> I do not translate (5) into English as 'When I talked to her 3 years ago, Tanja told me that she is pregnant now' because this English sentence does not have the relevant reading due to the fact that there is no relative present in English and *now* is indexical, unlike *sejčas* 'now'.

### Ekaterina Vostrikova

embedded contexts considered here is that past-under-past and present-under-past seem to be able to contribute the same meaning. Thus, it is not clear why there would be a meaning clash between an adverbial and the tense in one case but not in the other. Moreover, as I show in this paper, *togda* is an anaphoric element, and as such, it can pick different time intervals. There are many adverbials that denote a specific time interval and that are compatible with the present tense. However, there is something about the meaning of *togda* that makes it impossible for this element to pick the time intervals denoted by those adverbials. Another observation that I make in this paper, novel to my knowledge, is that the distribution of *togda* and *sejčas* in embedded contexts is not predicted by the existing theories of embedded tenses.

The discussion will go as follows. In §2 I will show that *togda* 'then' does not require past tense. Then I will provide the semantics of *togda* that accounts for the restriction on its use with present tense in Russian. I will suggest that *togda* carries a presupposition that the time intervals it picks are not equal to the evaluation time and will show how this presupposition accounts for the observed restrictions.

I will introduce my assumptions about the structure of the embedded clauses and the relative present in Russian and will show how the semantics of *togda* presented here correctly predicts the restrictions on its use in embedded contexts.

For the simultaneous reading of past-under-past I will adopt the classical *de re* approach (Abusch 1997; Heim 1994). I will show that the presupposition of *togda* that I am introducing is weak enough to make it compatible with the simultaneous reading of past-under-past.

In §3 I will show that the *de re* analysis of the simultaneous reading of past-under-past incorrectly predicts that Russian *sejčas* 'now' should be able to appear in such a context. Since under the *de re* analysis the tense moves out of the embedded clause and is interpreted separately from the adverbial, no meaning clash is predicted between the past tense and the present-oriented adverbial *sejčas*. I will propose that this problem can be solved if we adopt the assumption that a tense and an adverbial are interpreted together. Since past tense with the simultaneous reading under past in believe/say contexts is interpreted outside of the embedded clause, the adverbial *sejčas* has to be interpreted outside of the embedded clause as well.

In §4 I will show that a similar problem arises in English: *then* is predicted to be compatible with the *de re* interpretation of the present tense (which derives the so-called double access reading). §5 summarizes the findings.


### **2 Why** *togda* **is not compatible with the present tense**

### **2.1** *Togda* **is not compatible with the present tense in matrix and embedded contexts**

*Togda* is an anaphoric element, in the sense that it makes a reference to a time interval that has been mentioned in the previous discourse. Thus in (7), it makes reference to the interval picked by *v prošlom godu* 'last year'. In (8) it is anaphoric to the future time interval, the interval picked by *čerez tri goda* 'in three years'.


However, example (9), where *togda* 'then' appears in a clause with present tense, is not acceptable. The reason for this must be that *togda* cannot pick the time interval denoted by *v ėtom godu* 'this year'.

(9) V ėtom godu moj syn ∅ vo vtorom klasse. On izučaet matematiku (#togda).
in this year my son be.pres in second grade he study.pres math then
(Intended:) 'This year my son is in the second grade. He studies math (this year).'

There is some general principle that restricts the use of adverbials with the present tense both in English and in Russian. For example, the sentence in (10) does not mean that I am running now and it is 5am now.<sup>6</sup> It is felicitous only on the planned future interpretation.<sup>7</sup>

<sup>6</sup>See Kamp & Reyle (1993) for a pragmatic explanation for this fact.

<sup>7</sup>As an anonymous reviewer points out, those sentences can be used felicitously with the present tense interpretation in some contexts. For example, (10) can be used if the previous discourse was 'No one believed that I will start running, but here I am, running at 5am'.



I would like to leave this more general problem out of the scope of the discussion here. In order to do so, I will compare *togda* with those adverbials that are completely compatible with the present tense.

One example of such an adverbial is 'this year', as shown in (9). The question I will be focusing on is why in sentences like (9) *togda* cannot pick the same time interval as the one denoted by 'this year' and be compatible with the present tense, given that it can easily pick the interval denoted by 'last year' in (7) and 'in three years' in (8).

We can see from the well-formedness of (12), where 'this year' occurs in the embedded clause (with the embedded present tense) and is anaphoric to '2016' of the main clause, that *v ėtom godu* 'this year' in Russian can pick a year that is current with respect to the local evaluation time (Tanja's 'now' at the time when she said those words).

(12) V 2016 godu Tanja skazala mne, čto v ėtom godu ee syn ležit v bol'nice.
in 2016 year Tanja say.past me that in this year her son lie.pres in hospital
'In 2016 Tanja told me that that year her son was in the hospital.'

In (13) *v ėtom godu* 'this year' occurs in the main clause and the sentence without *togda* has the simultaneous reading. The presence of *togda* makes this sentence ill-formed. Since *v ėtom godu* 'this year' is perfectly compatible with the present tense (embedded, as in (12) and unembedded, as in (14)), the badness of *togda* in (13) must be due to the fact that it somehow cannot refer to this interval.

(13) Ja govorila s Tanej v ėtom godu i ona skazala mne, čto ee syn (#togda) vse ešče ležit v bol'nice.
I talk.past with Tanja in this year and she say.past me that her son then all still lie.pres in hospital
(Intended:) 'I talked with Tanja this year and she told me that her son was still in the hospital (then).'


(14) Tanin syn v ėtom godu vse ešče ležit v bol'nice.
Tanja's son in this year all still lie.pres in hospital
'Tanja's son is still in hospital this year.'

In principle, *togda* can pick an interval that is inside the interval denoted by 'this year', as shown in (15), where it occurs with the past tense embedded under past. In (15) *togda* anaphorically refers to the time when Tanja pronounced the words.

(15) Ja govorila s Tanej v ėtom godu i ona skazala mne, čto ee syn togda vse ešče ležal v bol'nice.
I talk.past with Tanja in this year and she say.past me that her son then all still lie.past in hospital
'I talked to Tanja this year and she told me that her son was still in the hospital then.'

The question I will address here is why *togda* cannot denote a time interval that is compatible with present tense.

### **2.2** *Togda* **does not require past tense**

I will start this discussion by ruling out the simple idea that Russian *togda* requires past tense in the same clause to be licensed. One implementation of such an idea would be that *togda* has to agree with past tense and the agreement relation can only be established locally.

In Russian, there are several adverbials that have a meaning similar to *togda* and can occur in subordinate clauses with past tense embedded under past. They are listed in (16). The fact that all of them are good with past-under-past is shown in (17).



All of them are infelicitous with present tense embedded under past, as shown in (18).

(18) # Ja govorila s Tanej v ėtom godu i ona skazala mne, čto ee syn {v to vremja / v tot moment / na tot moment} ležit v bol'nice.
I talk.past with Tanja in this year and she say.past me that her son {in that time / in that moment / on that moment} lie.pres in hospital
Intended: 'I talked to Tanja this year and she told me that her son was in the hospital {at that time / at that moment / by that moment}.'

A strong argument against the hypothesis that all of those elements have to be licensed by past tense comes from the fact that all of them are in fact compatible with future tense in the same clause. One example where *togda* occurred in a matrix clause with a future tense was given in (8). In (19) I show that all of the adverbials given in (16) are compatible with an embedded future. The antecedent for *togda* or *na tot moment* 'at that moment' in (19) is given in a previous sentence and the resulting sentence is well-formed.

(19) My obsuždali 2019 god. Tanja skazala, čto {na tot moment / v tot moment / togda} Medvedev budet prezidentom.
we discuss.past 2019 year Tanja say.past that {on that moment / in that moment / then} Medvedev be.fut president.inst
'We discussed 2019. Tanja said that {by that moment / at that moment / then} Medvedev would be the president.'

We can conclude that it is not the case that *togda* (as well as other anaphoric elements that are compatible with past-under-past and incompatible with present-under-past) needs to be licensed by the past tense in the same clause.

### **2.3 The semantics of** *togda*

I suggest that *togda* has the semantics given in (20). *Togda* carries an index that is mapped to a contextually given time interval (the interval *togda* is anaphoric to). It denotes a function of type ⟨*i*,*t*⟩: a function that takes a time interval and returns true if that interval surrounds the contextually given interval (in set talk: it denotes the set of time intervals that surround the contextually given interval). The key part of this semantics is the presupposition that *togda* carries: the time intervals it picks cannot be equal to the evaluation time with respect to which *togda* is interpreted.

$$\text{(20)} \quad [\![\text{togda}_5]\!]^{w,t,g,c} = \lambda t' : t' \neq t \,.\, g(5) \subseteq t'$$

A stronger presupposition that would also prevent *togda* from picking the time interval denoted by 'this year' would be that the time interval it picks does not overlap with the evaluation time. However, this would incorrectly predict that *togda* is incompatible with the simultaneous reading of past-under-past.

Let us consider what happens if we try to make *togda* anaphoric to the time interval denoted by 'this year'. The index 5 is mapped to the year-long interval surrounding the evaluation time.

$$\text{(21)} \quad g(5) = \text{the year of } t$$

Given those assumptions, the resulting meaning of *togda* with the index 5 is given in (22).

$$\text{(22)} \quad [\![\text{togda}_5]\!]^{w,t,g,c} = \lambda t' : t' \neq t \,.\, \text{the year of } t \subseteq t'$$

If we put together the semantics of *togda* given in (22) and present tense in a matrix context we will get a contradiction.

I will demonstrate this on the example of the second sentence of (9) that is given here separately as (23). The LF for it is given in (24).


I will assume that VPs like 'studies math' denote functions of type ⟨*e*, ⟨*i*,*t*⟩⟩. Thus, the vP gets a denotation of type ⟨*i*,*t*⟩. Let us assume that the assignment function *g* maps the index 7 to John.

$$\text{(25)} \quad [\![\text{vP}_{(24)}]\!]^{w,t,g,c} = \lambda t' \,.\, \text{John studies math at } t'$$

Since temporal adverbials like *togda* also denote functions of type ⟨*i*,*t*⟩ (predicates of times), they can combine with vPs via predicate modification. The result of this is given in (26).

$$\text{(26)} \quad [\![\text{vP}'_{(24)}]\!]^{w,t,g,c} = \lambda t' : t' \neq t \,.\, \text{the year of } t \subseteq t' \;\&\; \text{John studies math at } t'$$


I will adopt the pronominal semantics for tenses (Partee 1973). Tenses carry indices; thus, like other pronouns, they get their denotation via the assignment function *g*. The semantics for present tense that I will assume is given in (27). It simply denotes a specific time interval and presupposes that this time interval is equal to the evaluation time.

$$\begin{aligned} \text{(27)} \quad & [\![\text{PRES}_4]\!]^{w,t,g,c} = g(4) \\ & [\![\text{PRES}_4]\!]^{w,t,g,c} \text{ is only defined if } g(4) = t \end{aligned}$$

The predicate of times in (26) combines with the present tense via function-argument application. The predicted denotation for (24) is given in (28). *Togda* carries a presupposition that the time intervals it selects are not equal to the evaluation time. Present tense carries a presupposition that the time interval it denotes is equal to the evaluation time. It follows that when *togda* and the present tense combine, the two presuppositions contradict each other. This accounts for the infelicity of *togda* with present tense in matrix contexts.

$$\begin{aligned} \text{(28)} \quad & [\![\text{IP}_{(24)}]\!]^{w,t,g,c} = \text{T iff John studies math at } g(4) \text{ and the year of } t \subseteq g(4) \\ & [\![\text{IP}_{(24)}]\!]^{w,t,g,c} \text{ is defined only if } g(4) = t \text{ and } g(4) \neq t \end{aligned}$$
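The clash in (28) can be checked mechanically. The following is only a minimal sketch, not part of the analysis itself: it assumes that time intervals can be modeled as closed integer pairs, that presupposition failure can be modeled as `None`, and it ignores the lexical content of the vP. All names (`subset`, `togda`, `pres`, `clause`) and the sample values are hypothetical.

```python
from typing import Callable, Optional, Tuple

Interval = Tuple[int, int]  # closed interval (start, end); units are arbitrary


def subset(a: Interval, b: Interval) -> bool:
    """a ⊆ b: interval b surrounds interval a."""
    return b[0] <= a[0] and a[1] <= b[1]


def togda(antecedent: Interval, eval_time: Interval) -> Callable[[Interval], Optional[bool]]:
    """Predicate of times modeled on (20); None stands for presupposition failure."""
    def predicate(t_prime: Interval) -> Optional[bool]:
        if t_prime == eval_time:  # presupposition: t' must differ from t
            return None
        return subset(antecedent, t_prime)
    return predicate


def pres(value: Interval, eval_time: Interval) -> Optional[Interval]:
    """Present tense modeled on (27): defined only if g(4) = t."""
    return value if value == eval_time else None


def clause(tense_value: Optional[Interval],
           adverbial: Callable[[Interval], Optional[bool]]) -> Optional[bool]:
    """Apply the adverbial predicate to the tense; undefinedness projects."""
    if tense_value is None:
        return None
    return adverbial(tense_value)


now: Interval = (5, 5)          # evaluation time t (hypothetical value)
this_year: Interval = (0, 12)   # g(5) = the year of t, as in (21)

# Whatever interval g(4) is mapped to, one of the two presuppositions fails.
candidates = [now, this_year, (0, 3)]
results = [clause(pres(g4, now), togda(this_year, now)) for g4 in candidates]
print(results)
```

Under these assumptions every candidate value for *g*(4) yields an undefined result: either the present tense is undefined (*g*(4) ≠ *t*) or *togda* is (*g*(4) = *t*).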

Note that nothing prevents *togda* from picking a time interval within the current year as long as it is in the past or future with respect to the evaluation time. This is a good prediction because we still want to account for the well-formedness of (15).

The contradiction is predicted to arise when *togda* is used in embedded clauses with the present tense as well.

In languages where present-under-past can get the simultaneous reading, it is standardly interpreted as a relative present: a tense that denotes a local evaluation time (Ogihara 1989; von Stechow 1995; Ogihara & Sharvit 2012).

The denotation for the relative present is given in (29): essentially it has the same denotation as the regular present in Russian.

$$\begin{aligned} \text{(29)} \quad & [\![\text{PRES-REL}_1]\!]^{w,t,g,c} = g(1) \\ & [\![\text{PRES-REL}_1]\!]^{w,t,g,c} \text{ is only defined if } g(1) = t \end{aligned}$$

I will make the following assumptions about the structure and the interpretation of embedded clauses in belief reports. Intensional verbs are quantifiers over world–time pairs. The intensional verb 'say' combines with its complement clause via a version of the rule of Intensional Functional Application (Heim & Kratzer 1998). The intension of an expression XP is computed as shown in (30).


$$\text{(30)} \quad \lambda w' \lambda t' \,.\, [\![\text{XP}]\!]^{w',t',g,c}$$

I will make my point by using the example given in (31). The LF for the embedded clause is given in (32). For the simplicity of exposition, I reconstructed the subject to its base-position.


With these assumptions, I predict that the embedded clause with *togda* in our problematic sentence (31) will have the intension given in (33).

$$\text{(33)} \quad \lambda w' \lambda t' : t' \neq t' \,.\, \text{Tanja's son is in hospital in } w' \text{ at } t' \;\&\; \text{the year of } t' \subseteq t'$$

This intension includes a contradictory presupposition, thus the infelicity of *togda* is predicted.

One natural question arising at this point is whether this presupposition is predicted to block the use of *togda* in the simultaneous reading of past-under-past as well. This would not be the desired result, as the well-formedness of (2) shows. This is the question I will address in the next subsection.

### **2.4** *Togda* **and the simultaneous reading of past-under-past in Russian**

In order to account for the restriction on the use of *togda* with embedded and matrix present tense in Russian, I suggested that *togda* in Russian comes with a presupposition that the time intervals it picks are not equal to the local evaluation time.

The simultaneous reading of past-under-past in complement clauses in principle can be derived at least in two ways. One option is a past tense deletion rule. In this system, the past features on the embedded past are not interpreted and an embedded past is interpreted as a relative tense (Ogihara 1989; 1995). A relative tense is interpreted as a local evaluation time, thus, in this system, given the definition I proposed for *togda*, *togda* is predicted to be infelicitous with the simultaneous reading of the past tense.


But this is not the only way past tense could get the simultaneous interpretation. Another standardly assumed way of deriving the simultaneous reading is the *de re* construal (Abusch 1997; Heim 1994; Ogihara & Sharvit 2012). In what follows I will introduce the classic analysis of the *de re* construal and show that the presupposition of *togda* that I am proposing is not predicted to be in conflict with the simultaneous reading of past-under-past in complement clauses.

Thus, I will assume that when *togda* is acceptable with the simultaneous reading of past-under-past, the simultaneous reading is derived via the *de re* construal.

I will show how the system works by using example (2), repeated here as (34).

(34) V 2016 godu Tanja skazala, čto togda Putin byl prezidentom Rossii.
in 2016 year Tanja say.past that then Putin be.past president.inst Russia.gen
'In 2016 Tanja said that Putin was the president of Russia then.'

Abusch proposed to extend the *de re* analysis for singular terms developed by Kaplan (1969), Lewis (1979) and Cresswell & von Stechow (1982) to the analysis of tenses in intensional contexts. In my exposition of the temporal *de re* construal I will use Cable's (2015) exposition of this system, which relies on Heim's (1994) implementation.

The past tense undergoes movement within the lower clause, leaving a trace t<sup>4</sup> and triggering lambda abstraction, indicated by 4 in Figure 1. The result of this movement is a predicate of times in the embedded clause.

After that, the past tense undergoes another short movement that is called the res-movement (Heim 1994). This type of movement is special because the moved element does not leave a trace and does not move to a c-commanding position.<sup>8</sup> It moves to the position of the sister of the verb *say*. Thus, this tense will be interpreted outside of the clause where it originates.

Intensional verbs like *say* are ambiguous between their regular denotation and the denotation given in (35).

<sup>8</sup>Due to those properties res-movement is highly controversial from the syntactic perspective. There is a less controversial way of deriving *de re* readings (developed for individual arguments) via concept generators that was proposed by Percus & Sauerland (2003).


The function denoted by *say* first combines with the tense that has been moved from the lower clause and now is its sister. Then it combines with the intension of the predicate of times created by the movement. After that, it takes an individual (the subject) and the time argument of the higher clause. Intensional verbs contribute quantification over time-concepts (relations between a world, time and another time). Those time-concepts should be understood as descriptions by which a believer represents a time interval to herself.


The denotation for *v 2016 godu* 'in 2016' is given in (36): it denotes a set of intervals within 2016.

$$\text{(36)} \quad [\![\text{v 2016 godu}]\!]^{w,t,g,c} = \lambda t' \,.\, t' \subseteq 2016$$

*Togda* in (34) can anaphorically refer either to 2016 or to the time in 2016 when Tanja said the words. I will assume the first option (but nothing hinges on this choice): the assignment function *g* maps index 7 on *togda* to the set of intervals in 2016. *Togda* will denote the set of intervals that surround 2016.

$$\begin{aligned} \text{(37)} \quad [\![\text{togda}_7]\!]^{w,t,g,c} &= \lambda t' : t' \neq t \,.\, g(7) \subseteq t' \\ &= \lambda t' : t' \neq t \,.\, 2016 \subseteq t' \end{aligned}$$

The intension of the predicate of times is computed in (38). In this system, the time of Putin's presidency (in Tanja's say-alternatives) and the local evaluation time are two distinct times. *Togda* contributes the presupposition that those two intervals are not equal to each other.<sup>9</sup>

$$\text{(38)} \quad \lambda w \lambda t \,.\, [\![4\ [t_4\ [\text{togda}_7 \text{ Putin be president}] \ldots]]\!]^{w,t,g,c} = \lambda w \lambda t \lambda t' : t' \neq t \,.\, \text{Putin is the president in } w \text{ at } t' \;\&\; 2016 \subseteq t'$$

The resulting semantics for the entire sentence is given in (39).

$$\begin{aligned} \text{(39)} \quad & [\![\text{Figure 1}]\!]^{w,t,g,c} = \text{T iff } \exists P : g(4) = \text{the time } z \text{ such that } P(w)(g(2))(z) \;\&\; g(2) \subseteq 2016 \;\&\; \\ & \forall \langle w'', t'' \rangle \in \text{SAY-ALT}(\text{Tanja}, w, g(2)) : \\ & [\lambda t' : t' \neq t'' \,.\, \text{Putin is the president in } w'' \text{ at } t' \;\&\; 2016 \subseteq t'](\text{the } z \text{ such that } P(w'')(t'')(z)) = \text{T} \\ & [\![\text{Figure 1}]\!]^{w,t,g,c} \text{ is defined only if } g(2) < t \text{ and } g(4) < t \end{aligned}$$

This sentence is predicted to be true in case there is a time concept *P* that relates the particular time in the past when Tanja pronounced those words in the actual world and the past moment denoted by the moved past tense such that the same relation also holds between the time when Tanja located herself in her doxastic alternatives (her local now) and the time when Putin is the president in her doxastic alternatives.

<sup>9</sup>The full *de re* analysis requires another presupposition in the embedded clause that the time of the state or eventuality described in the embedded clause is not in the future with respect to the local evaluation time (the upper limit constraint; cf. Abusch 1997). The full intension of the embedded clause is shown in (i). This presupposition is responsible for the fact that past-under-past cannot have the forward shifted reading.

$$\text{(i)} \quad \lambda w \lambda t \,.\, [\![4\ [t_4\ [\text{togda}_7 \text{ Putin be president}] \ldots]]\!]^{w,t,g,c} = \lambda w \lambda t \lambda t' : t' \neq t \;\&\; \neg\, t' > t \,.\, \text{Putin is the president in } w \text{ at } t' \;\&\; 2016 \subseteq t'$$

One such possible time concept in the case under consideration is given in (40).

$$\text{(40)} \quad \lambda w \lambda t' \lambda t'' \,.\, t'' \text{ is a year-long interval that surrounds } t' \text{ in } w$$

The two intervals this concept relates are not equal to each other: one surrounds the other one, thus the presupposition introduced by *togda* is satisfied. The existence of the concept given in (40) can make the entire formula in (39) true. The presupposition requires that *g*(4) and *g*(2) are in the past. Given this time concept, the first conjunct in (39) is as in (41).

$$\text{(41)} \quad g(4) = \text{the time } z \text{ such that } z \text{ is a year-long interval that covers the time } g(2) \text{ (the time when Tanja said those words)} \;\&\; g(2) \subseteq 2016$$

The second conjunct is also true: in all of Tanja's doxastic alternatives, Putin is president at the time *z* such that *z* is a year-long interval that surrounds her local 'now' (at the time when she said the words) and 2016 ⊆ *z*.

Since the temporal *de re* construal derives the simultaneous reading of past-under-past without requiring that the two time intervals are exactly equal, the presupposition that *togda* carries does not interfere with the meaning of the sentence. Thus, the presupposition that I am proposing is strong enough to rule out *togda* with present-under-past, but weak enough to make it compatible with a simultaneous reading of past-under-past.

The semantics (39) also accounts for the fact that *togda* enforces the simultaneous reading.<sup>10</sup>

*Togda* picks the intervals that surround the time it is anaphoric to. When *togda* is in an embedded say-context and it is anaphoric to the time of saying, it is predicted to contribute the claim that what is described by the embedded sentence is happening at the time that surrounds the time of saying (from the speaker's perspective).
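The difference between the proposed presupposition (non-identity with the evaluation time) and the stronger alternative mentioned in §2.3 (non-overlap with the evaluation time) can be illustrated with a small sketch. It rests on several simplifying assumptions that are not part of the analysis: time is counted in months from January 2016, intervals are closed integer pairs, presupposition failure is modeled as `None`, and the year-long interval of concept (40) is resolved to the calendar year. All function names are hypothetical.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # months counted from January 2016 (an assumption)


def subset(a: Interval, b: Interval) -> bool:
    """a ⊆ b: interval b surrounds interval a."""
    return b[0] <= a[0] and a[1] <= b[1]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def calendar_year(t: Interval) -> Interval:
    """Stand-in for concept (40): the year-long interval surrounding t.

    Assumes t lies within a single calendar year.
    """
    start = (t[0] // 12) * 12
    return (start, start + 12)


def togda_weak(antecedent: Interval, eval_time: Interval, t_prime: Interval) -> Optional[bool]:
    """Proposed presupposition, as in (20): t' is not equal to t."""
    if t_prime == eval_time:
        return None
    return subset(antecedent, t_prime)


def togda_strong(antecedent: Interval, eval_time: Interval, t_prime: Interval) -> Optional[bool]:
    """Rejected alternative: t' must not overlap with t."""
    if overlaps(t_prime, eval_time):
        return None
    return subset(antecedent, t_prime)


YEAR_2016: Interval = (0, 12)
local_now: Interval = (5, 5)    # Tanja's local 'now' (hypothetical value)
z = calendar_year(local_now)    # the time picked out by the concept

weak = togda_weak(YEAR_2016, local_now, z)
strong = togda_strong(YEAR_2016, local_now, z)
print(z, weak, strong)
```

In this toy model the interval picked out by the time concept surrounds the local evaluation time without being identical to it, so the weak presupposition is satisfied while the strong one fails, which mirrors the reasoning in the text.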

<sup>10</sup>I would like to thank an anonymous reviewer who made this point.


### **3** *Sejčas* **in simultaneous readings of past-under-past in Russian**

The set of assumptions that made it possible for us to derive the compatibility of *togda* with the simultaneous reading of past-under-past in complement clauses leads to the prediction that *sejčas* 'now' should be acceptable in those contexts as well. Thus the contrast between (5) and (6) (repeated here as (42) and (43)) is not predicted.


In (42) *sejčas* 'now' is acceptable, which means that *sejčas* is not an unshiftable indexical in Russian. In (42) *sejčas* picks the time interval three years ago when the conversation happened. Thus I will treat *sejčas* as sensitive to the evaluation time and not to the context time as shown in (44): *sejčas* denotes a predicate of times that is true of intervals that surround the local evaluation time. (If instead of surrounding we chose a relation of being equal to, it would not have any significant effect on the final outcome of the system.)

$$\text{(44)} \quad [\![\text{sejčas}]\!]^{w,t,g,c} = \lambda t' \,.\, t \subseteq t'$$

Under those assumptions the fact that *sejčas* is acceptable in (42) follows straightforwardly (with the assumption that present tense in Russian can be interpreted as a relative present).
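As a rough illustration of why a shiftable *sejčas* is unproblematic with the relative present, consider the following sketch. It assumes, purely for illustration, that times are closed integer intervals measured in months relative to the utterance time, that undefinedness is modeled as `None`, and that the lexical content of the embedded clause can be ignored. The names `sejcas`, `pres_rel` and `embedded_clause` are hypothetical.

```python
from typing import Callable, Optional, Tuple

Interval = Tuple[int, int]  # months relative to the utterance time (an assumption)


def subset(a: Interval, b: Interval) -> bool:
    """a ⊆ b: interval b surrounds interval a."""
    return b[0] <= a[0] and a[1] <= b[1]


def sejcas(eval_time: Interval) -> Callable[[Interval], bool]:
    """Predicate of times modeled on (44): true of intervals surrounding t."""
    return lambda t_prime: subset(eval_time, t_prime)


def pres_rel(value: Interval, eval_time: Interval) -> Optional[Interval]:
    """Relative present modeled on (29): defined only if g(1) = t."""
    return value if value == eval_time else None


def embedded_clause(local_now: Interval) -> Optional[bool]:
    """Tense plus adverbial, both evaluated at the local evaluation time."""
    tense = pres_rel(local_now, local_now)
    if tense is None:
        return None
    return sejcas(local_now)(tense)


three_years_ago: Interval = (-36, -36)   # Tanja's local 'now'
utterance_time: Interval = (0, 0)
pregnancy: Interval = (-40, -31)         # hypothetical interval

shifted = sejcas(three_years_ago)(pregnancy)    # evaluated at the local now
unshifted = sejcas(utterance_time)(pregnancy)   # evaluated at the utterance time
print(embedded_clause(three_years_ago), shifted, unshifted)
```

Because both the relative present and *sejčas* are keyed to the same local evaluation time, the combination is defined and true in this model, whereas an unshiftable indexical anchored to the utterance time would not hold of the past interval.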

The main interest for us here is the example (43) and the fact that *sejčas* is not acceptable in this context. Again I will assume the *de re* construal for the simultaneous reading of past-under-past in (43). The LF that will be interpreted here, namely Figure 2, is structurally identical to the one given in Figure 1 (but the lexical items are different).


Figure 2: LF of the relevant part of (43)

Again, the attitude verb combines with its res-argument (the tense that was moved from the lower clause), the intension of the predicate of times created by the movement, an individual (the matrix subject), and the time argument of the matrix clause. The intension of the embedded clause is given in (45) (under the assumption that *g* maps index 7 to Tanja).

$$\text{(45)} \quad \lambda w \lambda t \,.\, [\![4\ [t_4\ [\text{sejčas she}_7 \text{ is pregnant}]]]\!]^{w,t,g,c} = \lambda w \lambda t \lambda t' \,.\, t \subseteq t' \;\&\; \text{Tanja is pregnant in } w \text{ at } t'$$

The resulting semantics for the entire sentence is given in (46).

$$\begin{aligned} \text{(46)} \quad & [\![\text{Figure 2}]\!]^{w,t,g,c} = \text{T iff } \exists P : g(4) = \text{the time } z \text{ such that } P(w)(g(2))(z) \;\&\; g(2) \text{ is a time 3 years ago} \;\&\; \\ & \forall \langle w'', t'' \rangle \in \text{SAY-ALT}(\text{Tanja}, w, g(2)) : \\ & [\lambda t' \,.\, t'' \subseteq t' \;\&\; \text{Tanja is pregnant in } w'' \text{ at } t'](\text{the } z \text{ such that } P(w'')(t'')(z)) = \text{T} \\ & [\![\text{Figure 2}]\!]^{w,t,g,c} \text{ is defined only if } g(2) < t \text{ and } g(4) < t \end{aligned}$$


The contribution that *sejčas* ends up making is that the time of the state described by the embedded clause (Tanja's pregnancy) surrounds the time when Tanja locates herself in her doxastic alternatives at the time of saying. This should give us the simultaneous reading.

One possible concept that will be suitable in this case is given in (47).

$$\text{(47)} \quad \lambda w \lambda t' \lambda t'' \,.\, t'' \text{ is a 9-month interval that surrounds } t' \text{ in } w$$

In the actual world, *g*(4) (the past time from the embedded clause) is the *z* such that it is the 9-month interval that surrounds *g*(2) (the past time of saying). In Tanja's alternatives, Tanja is pregnant at the time *z* such that *z* is the 9-month interval that surrounds her local now.

Intuitively it is clear that the clash happens because *sejčas* has a present tense orientation and it is not compatible with the past tense. However, the past tense under the *de re* analysis of the simultaneous reading of past-under-past is not interpreted in the same clause as *sejčas*, thus no clash is predicted. Moreover the presence of *sejčas* in the sentence is predicted to enforce the simultaneous reading of past-under-past the way 'then' enforces it, due to the fact that *sejčas* picks intervals that surround the local evaluation time.

In order to account for the fact observed in (43) I suggest that when tense is interpreted outside of the embedded clause, *sejčas* is interpreted together with it.

This can be implemented in a system where tense and the adverbial undergo the res-movement together. To move *sejčas* together with tense, I will allow tense to combine with adverbials directly: I will change the denotation of tenses and suggest that they take predicates of times (like the one denoted by *sejčas* or *togda*) as their first arguments (48). I will consider tense pronouns to be definite articles of times: they combine with a predicate of times and return a specific time interval. In doing so I do not depart significantly from the pronominal semantics of tense: I adopt the idea that all pronouns are definite articles (Elbourne 2005), and the pronominal element is still there in the semantics of tense suggested in (48). In this system, just like in the classic pronominal approach to the semantics of tense, past tense denotes a particular interval of time. An adverbial acts as a restrictor on the possible intervals that the tense can denote.

$$\begin{aligned} \text{(48)} \quad & [\![\text{PAST}_2]\!]^{w,t,g,c} = \lambda P_{\langle i,t \rangle} \,.\, \iota t' \,[P(t') = \text{T} \;\&\; t' = g(2)] \\ & [\![\text{PAST}_2]\!]^{w,t,g,c} \text{ is defined only if } g(2) < t \end{aligned}$$

If *sejčas* undergoes res-movement to the matrix clause together with the past tense, the restriction on the use of *sejčas* in simultaneous readings of past-under-past that we observe in (43) follows directly. The predicted result of applying past to *sejčas* is given in (49). Given our definition in (44), the time interval denoted by *sejčas* has to surround the evaluation time, which for the matrix clause is the time of evaluation of the entire sentence, i.e. now. The time interval denoted by the past has to strictly precede the evaluation time. There is no interval that is simultaneously strictly in the past with respect to the current moment and surrounds it. Thus the clash between the past tense and *sejčas* is predicted.

$$\begin{aligned} \text{(49)} \quad & [\![\text{PAST}_2]\!]^{w,t,g,c}([\![\text{sejčas}]\!]^{w,t,g,c}) = \iota t'\,[t \subseteq t' \;\&\; t' = g(2)] \\ & [\![\text{PAST}_2]\!]^{w,t,g,c}([\![\text{sejčas}]\!]^{w,t,g,c}) \text{ is defined only if } g(2) < t \end{aligned}$$
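The incompatibility of the two conditions in (49) can be checked mechanically on a toy model. The following is a minimal sketch, not part of the analysis itself. It assumes that time intervals are closed integer intervals written as `(start, end)` pairs, that "surrounds" is inclusion, and that "strictly precedes" means ending before the other interval begins. All function names are hypothetical.

```python
from itertools import combinations_with_replacement

# Toy model (an assumption for illustration): a time interval is a pair
# (start, end) of integers with start <= end.

def surrounds(outer, inner):
    """True if `inner` is included in `outer` (inner ⊆ outer)."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def strictly_precedes(a, b):
    """True if interval `a` ends before interval `b` begins (a < b)."""
    return a[1] < b[0]

def past_of_sejcas(candidate, eval_time):
    """Mimics (49): `candidate` plays the role of g(2). It is returned only
    if it surrounds the evaluation time (contribution of sejčas) and
    strictly precedes it (presupposition of PAST); otherwise None."""
    if surrounds(candidate, eval_time) and strictly_precedes(candidate, eval_time):
        return candidate
    return None

def all_intervals(lo, hi):
    """All integer intervals (start, end) with lo <= start <= end <= hi."""
    return list(combinations_with_replacement(range(lo, hi + 1), 2))

now = (5, 5)  # matrix evaluation time in the toy model
defined = [i for i in all_intervals(0, 10) if past_of_sejcas(i, now) is not None]
print(defined)  # prints [] because no interval satisfies both conditions
```

The empty result reflects the reasoning in the text: an interval that includes the evaluation time cannot end before that time begins.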

### **4** *Then* **with present-under-past in English**

In English, present-under-past in complement clauses cannot get the simultaneous reading. The English present tense cannot be interpreted as a relative present. The absence of the relative present reading in (50) shows that the English present tense is sensitive to the context time and not the evaluation time. In (50) present-under-past gets only the so-called double access reading.

(50) This year Tanja said that Putin is the president of Russia.

This reading requires that if what Tanja said was true when she said it, then Putin must be the president of Russia now. In other words, the embedded claim has to be true both at the matrix utterance time and at the time of the doxastic alternatives. Abusch (1997) has shown that this reading can be derived if we interpret the present tense of the embedded clause in (50) *de re*.

The present tense undergoes res-movement (Heim 1994). This creates an LF structurally similar to the one given in Figure 1. Again, given the semantics for *say* in (35), there is a parallelism requirement on the relation between the present tense moved from the embedded clause and the past moment of saying on the one hand, and the relation between the time of the presidency in Tanja's say-alternatives and the time when she locates herself on the other. Due to a constraint on the interpretation of embedded tenses called the upper limit constraint – the idea that the tense of an embedded clause cannot be a future-directed concept – this cannot be the relation of the past preceding the present and Putin's presidency being in the future with respect to Tanja's local now (Abusch 1997). The only other option is the relation of surrounding, where the interval denoted by the present tense surrounds the one denoted by the past.

In English *then* is also not compatible with present-under-past (51).

(51) #This year Tanja said that Putin is the president of Russia then.

Even if English *then* has the same denotation as Russian *togda* and carries the relevant presupposition, the restriction observed in (51) does not follow unless we make an assumption that the adverbial has to undergo the res-movement together with the present tense. This is the same problem as the one we saw with the Russian *sejčas* in *de re* construals.

In a *de re* construal, tense moves out of the clause it originates in. The presupposition of non-equality between the evaluation time and the time intervals *then* picks will translate in this system into the requirement of non-identity of the time of presidency and the time when Tanja locates herself. This is not problematic, given that the relation between them is the relation of surrounding. *This year* is an adverbial that is compatible with the present tense in English, thus if *then* can be anaphoric to this adverbial, no clash is predicted between *then* and the present tense.

The restriction we observe in (51) is straightforwardly predicted in the system where the English *then* has the same denotation as the Russian *togda,* shown in (52), and tense adverbials are interpreted together with tense. If tense undergoes the res-movement, the adverbial has to move with it.

$$\text{(52)} \quad [\![\text{then}_5]\!]^{w,t,g,c} = \lambda t' : t' \neq t.\ g(5) \subseteq t'$$

If we extend the analysis suggested here for the Russian *sejčas*-cases to English cases with *then*, the fact observed in (51) follows without any further assumptions. Present tense takes *then* as its argument. Since *then* moves together with the tense and is also interpreted in the matrix clause, a clash is predicted between the presupposition of the present tense (that it denotes the time interval equal to the context time, which in turn equals the matrix evaluation time) and the presupposition of *then* (that the intervals it picks are not equal to the evaluation time). This is shown in (53).

$$\begin{aligned} \text{(53)} \quad & [\![\text{PRES}_2]\!]^{w,t,g,c}([\![\text{then}_5]\!]^{w,t,g,c}) = \iota t'\,[g(5) \subseteq t' \;\&\; t' = g(2)] \\ & [\![\text{PRES}_2]\!]^{w,t,g,c}([\![\text{then}_5]\!]^{w,t,g,c}) \text{ is defined only if } g(2) = t_c \text{ and } g(2) \neq t_c \end{aligned}$$

Thus, if we extend the analysis suggested here for the Russian *sejčas*-cases with the embedded past to English *then*-cases with the embedded present, the ill-formedness of (51) follows without any further assumptions.
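The contrast between interpreting *then* in the embedded clause and interpreting it together with the moved present tense can be illustrated with a small toy model. This is only a sketch under simplifying assumptions: intervals are integer pairs, presupposition failure is modelled as `None`, and the concrete values chosen for the context time, the anchor g(5) and Tanja's local now are invented for illustration. The function names are hypothetical.

```python
# Toy model (an assumption for illustration): intervals are (start, end)
# pairs of integers; an undefined denotation is represented by None.

def includes(outer, inner):
    """True if `inner` is included in `outer` (inner ⊆ outer)."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def then(anchor, eval_time):
    """Mimics (52): a predicate of times t' that is defined only if
    t' != eval_time and, where defined, is true iff anchor ⊆ t'."""
    def predicate(t_prime):
        if t_prime == eval_time:
            return None  # presupposition of `then` fails
        return includes(t_prime, anchor)
    return predicate

def pres(referent, context_time):
    """Tense as a definite article of times: combines with a predicate of
    times and returns `referent` (the role of g(2)) if it equals the
    context time and satisfies the predicate; otherwise None."""
    def apply(restrictor):
        if referent != context_time:
            return None  # presupposition of the present tense fails
        if restrictor(referent) is not True:
            return None  # restrictor is undefined or false of the referent
        return referent
    return apply

context_time = (0, 10)  # hypothetical context time
anchor = (2, 4)         # hypothetical value of g(5)
tanja_now = (3, 3)      # hypothetical embedded evaluation time

# `then` evaluated relative to the embedded evaluation time: no clash.
in_situ = pres(context_time, context_time)(then(anchor, tanja_now))
# `then` evaluated in the matrix clause, where evaluation time = context time.
moved = pres(context_time, context_time)(then(anchor, context_time))

print(in_situ)  # (0, 10)
print(moved)    # None
```

Under these assumptions the result is undefined only when *then* is evaluated against the matrix evaluation time, which is the configuration that arises if the adverbial has to move together with the tense.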

### **5 Conclusion**

In this paper, I looked at simultaneous readings of present-under-past and past-under-past in complement clauses in Russian. I have formulated the adverbial puzzle: there are adverbials like *togda* 'then' that can enforce the simultaneous reading of past-under-past but are completely infelicitous with present-under-past; and there are adverbials like *sejčas* 'now' that are compatible with an embedded relative present, but not with past-under-past.

I suggested that the restriction on the use of *togda* in Russian can be explained if *togda* carries a presupposition that the time intervals it picks are not equal to the evaluation time. I have shown that this presupposition is strong enough to make *togda* incompatible with the relative present, yet weak enough to be compatible with the simultaneous reading of past-under-past. The reason for this is that the simultaneous reading of past-under-past in Russian is derived via *de re* construal, and the meaning resulting from this construal does not require that the two intervals be equal; it is enough for them to overlap.

I have demonstrated that the fact that *sejčas* is felicitous with present-under-past in Russian and is not acceptable with the simultaneous reading of past-under-past does not follow from the classic *de re* analysis of simultaneous readings of past-under-past. The reason for this is that since the past tense moves out of the embedded clause, no meaning clash is predicted between the meaning of the present-oriented adverbial *sejčas* and the past tense. I have shown that the fact that *sejčas* is infelicitous with past-under-past in Russian follows straightforwardly if we allow it to be interpreted *de re* together with an embedded past tense. I extended this analysis to explain the fact that *then* is not compatible with present-under-past in English.

### **Abbreviations**


### **Acknowledgements**

I would like to thank Seth Cable for his help with this project. I am also grateful to Barbara Partee, Petr Kusliy, Sakshi Bhatia and the participants of the semantics seminar on tense and aspect at UMass (Fall 2015) and FDSL 12 for their useful comments and feedback. Also I would like to thank the editors of the volume and two anonymous reviewers for their comments and suggestions.

### **References**



### **Chapter 21**

## **How factive is the perfective? On the interaction between perfectivity and factivity in Polish**

### Karolina Zuchewicz

Leibniz-Zentrum Allgemeine Sprachwissenschaft & Humboldt-Universität zu Berlin

> This paper aims to provide evidence for a systematic correlation between the perfective aspect of the matrix verb and the factive interpretation of embedded object sentences in Polish. Embedding by perfective matrix verbs makes propositions systematically 'more factive' than embedding by their imperfective counterparts. The strength of the inference depends on the semantic class the verb belongs to. The perfective operator introduces a nearly undefined truthfulness feature, which is specified as factive, veridical or reliable depending on the relation between the truth of the proposition expressed by the embedded clause and the event described by the matrix verb.

> **Keywords:** perfectivity, factivity, veridicality, presupposition, entailment, implicature

### **1 Introduction**

There is no single, simple way to define factivity, let alone truth-related inferences in general. In this paper, I will adopt assumptions which can be used to describe perfectivity-dependent truthfulness in Polish. It should be pointed out that the whole spectrum of the available literature is much broader.

According to Kiparsky & Kiparsky (1970), a verb *V* that takes a that-clause *p* is called factive if asserting *Vp* presupposes the truth of the complement *p* (but see also Karttunen 1971 for a discussion of the presuppositional account). Following Egré (2008: 101), a verb *V* is called veridical if it entails the truth of its complement when used in the positive declarative form, more precisely if it satisfies the scheme *Vp* → *p* for all *p*, where *p* is a that-clause.

I will refer to the first option as truth presupposition (factivity in a common sense). It holds when the inference remains under negation or after the insertion of a modal adverbial (for a semantic definition of presupposition, see Strawson 1950). Truth presupposition concerns for instance perfective *przewidzieć*, as can be seen in the following examples. '≫' marks presupposition.<sup>1</sup>


Examples (1) and (2) consist of an aspectual minimal pair exhibiting complementary behavior of the feature [±perfective] with respect to the enforcing of a factive interpretation of their complement sentences. Whereas the perfective variant presupposes the truth of its sentential argument, the imperfective one does not. After the insertion of a sentence negation or a modal adverbial, (1) implicates that Marek fears ghosts. Sentence (2) only says that Ola was guessing / tried to predict that Marek fears ghosts, but it leaves it open whether she was correct or not.

The second option will be called truth entailment. Truth entailment results in an occurrence of a veridical meaning of the proposition expressed by the subordinate clause. Here, the inference is present in affirmative sentences, but it does not project. We can find it for example in the perfective *potwierdzić*. I will use '→' to mark entailment.

<sup>1</sup>All embedded verbs are marked for the imperfective aspect and used in the present tense in order to exclude the influence of perfectivity and past tense morphology within the subordinate clause on the truth inferences observed.


Truth entailment can be found in *potwierdzić* and is absent in *potwierdzać*; whereas it seems to follow from (3) that Marek fears ghosts, (4) states that this is a possible, but not an obligatory interpretation. The inference presented in (3) does not project, which excludes it from being a presupposition. Consider (5).

(5) Komisarz nie / prawdopodobnie potwierdził, że Marek boi się duchów.
commissioner neg / probably confirmed.pfv that Marek fears.ipfv refl ghost.pl
'The commissioner did not confirm / probably confirmed that Marek fears ghosts.'
↛ Marek fears ghosts.

The third and weakest option is the truth implicature (see also Hacquard 2006 for the so-called actuality implicature, as illustrated in (9)). I will use this term to refer to an inference which cannot be captured by factivity or veridicality (it is clearly pragmatic, since it can be canceled). Here, the proposition embedded under a perfective communication verb is taken for granted due to the reliability of the sentence subject (cf. Schlenker 2010 for the factivity of announcements). The same proposition embedded under a particular imperfective counterpart is neutral with respect to the reliability condition. Consider examples (6a) and (6b). I will use '↝' to mark implicature.

	- b. Ela mówiła, że jest w pracy.
	  Ela said.ipfv that is.ipfv at work
	  'Ela was saying that she is at work.'
	  ↝̸ Ela is at work.

In the perfective variant (6a), the speaker takes the truth of what the sentence subject said for granted; Ela is considered reliable and she is expected to tell the truth. In the imperfective variant (6b), the speaker does not want to commit herself to the truth of the proposition. It is left open whether the speaker considers the sentence subject reliable. As a result, there is no implicature that Ela is at work. The reliability effect seems to correlate with fulfilling all the parts of the speech act (see §4.3), which is necessarily the case when using the perfective and which does not have to be the case when using the imperfective (see Cohen & Krifka 2014; Krifka 2015 for commitment space semantics).

We could refer to the abovementioned inferences as a truthfulness scale. The strongest inference – truth presupposition – represents the highest value on that scale, while truth implicature stands for the lowest one. Between them there is the medium strength inference – truth entailment. A more detailed classification may be developed after more conclusive research has been done.
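The three positions on the scale can be told apart by the diagnostics introduced above: whether the inference is present in affirmative sentences, whether it projects under negation or a modal adverbial, and whether it can be canceled. The following is a minimal sketch of that decision procedure. The function name, the enum and the idealized judgment profiles are assumptions made for illustration, and the sketch deliberately ignores gradient cases.

```python
from enum import Enum

class Inference(Enum):
    """Positions on the truthfulness scale, strongest first."""
    PRESUPPOSITION = 3
    ENTAILMENT = 2
    IMPLICATURE = 1
    NONE = 0

def classify(holds_in_affirmative, projects, cancelable):
    """Map three diagnostic judgments to a position on the scale."""
    if not holds_in_affirmative:
        return Inference.NONE
    if projects:
        # Survives negation and modal adverbials.
        return Inference.PRESUPPOSITION
    if cancelable:
        # Pragmatic inference that can be canceled.
        return Inference.IMPLICATURE
    return Inference.ENTAILMENT

# Idealized profiles (assumptions for illustration, following the
# characterization of each class in the text):
# (holds_in_affirmative, projects, cancelable)
profiles = {
    "przewidzieć (pfv)": (True, True, False),
    "potwierdzić (pfv)": (True, False, False),
    "perfective communication verb": (True, False, True),
    "potwierdzać (ipfv)": (False, False, False),
}

for verb, judgments in profiles.items():
    print(verb, "->", classify(*judgments).name)
```

The numeric values of the enum encode the ordering of the scale, so two inferences can be compared by strength via their `value`.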

Perfectivity-dependent truthfulness needs to be distinguished from the truthfulness of inherently factive imperfectives, where the perfectivizing operation only results in the specification of a temporal boundary of an event, for instance: *żałować* 'regret.ipfv' vs. *pożałować* 'start regretting.pfv' (cf. Egré 2008 for an interesting discussion about *regret*, though), unless the meaning of the derivate becomes non-compositional (*wiedzieć* 'know.ipfv' + factive vs. *powiedzieć* 'say.pfv' + non-factive). In contrast, all truth inferences which originate from the perfective do not occur in the case of the respective imperfective counterparts.

At this point, I would like to make an important remark concerning the tense of the matrix verb. I use the past tense in all examples, because it is available for any verbal stem regardless of the aspectual marking. The present tense morphology results in future reference in the case of the perfective, whereas both present tense and the periphrastic future construction are available for the imperfective. Because the analyzed sentences are supposed to be minimal pairs (differing only in the aspectual marking on the matrix verb), using the past tense was the only option.

In this paper, I will examine different verbs falling into class 1 (truth presupposition), class 2 (truth entailment) and class 3 (truth implicature). I provide an account of perfectivity-dependent truth inferences in Polish, which will be presented in §5. Before coming to that, I will briefly discuss the influence of aspect on the interpretation of nominal arguments, which serves as a starting point for an investigation of the correlation between the perfectivity of the matrix verb and the interpretation of complement sentences.

### **2 Aspect and the interpretation of nominal arguments**

It has been pointed out by Wierzbicka (1967) that in perfective sentences in Polish the direct object is interpreted as definite, while in imperfective ones it is understood as indefinite. Consider example (7).

(7) On zjadł / jadł orzechy.
he ate.pfv / ate.ipfv nut.pl
'He ate all of the nuts / was eating (some) nuts.'

In the case of *zjadł*, the reference is to a definite group of entities – the nuts. The object is completely affected by the verbal process (as a result, there are no more nuts left). In contrast, neither the definite nor the totality reading is enforced when using *jadł*. Here, the partitive interpretation is available amongst others, corresponding to 'some of the nuts'.

However, Filip (2005: 128) shows that perfective aspect does not always require that bare nominal arguments in its scope refer to one whole and specific individual (consider for instance the perfective Czech and Polish equivalents of the English verb *bring*). That means that not only aspect, but also verb semantics and especially the thematic relation between the nominal object and the verb determine the referential properties of the entire predicate.

The crucial point is that the perfective operator can take scope over both the matrix verb and its nominal complement. A formal analysis of this correlation has been developed by Krifka (1989a,b; 1992; 1998). Different theoretical implementations are possible; because it is not the main focus of this paper, I will not discuss them in greater detail.

According to Krifka, complex verbal expressions (verb plus direct object) have either a cumulative or a quantized reference. We can define them in terms of the sum operation: *x* ⊔ *y* 'the sum of *x* and *y*'. For example, the sum of two events of 'eating grapes' still yields an event of 'eating grapes'. The predicate 'eating grapes' has a cumulative reference – we can apply it not only to the single events, but also to the sum of them. In contrast, the joining of two events of 'eating two grapes' can no longer be described with 'eating two grapes', because 'eating two grapes' plus 'eating two grapes' does not equal 'eating two grapes'. The predicate 'eating two grapes' has a quantized reference – we can apply it to the single events, but not to the sum of them. Apart from the sum operation, the proper part relation can be defined: *x* < *y* ↔ *x* ⊑ *y* and *x* ≠ *y*. For example, there is no proper part of an event 'eating two grapes' which is an event of 'eating two grapes'. This illustrates another property of the quantized reference.
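The two properties can be made concrete in a small model. The sketch below is an illustration only. It assumes, for simplicity, that an eating event is identified with the set of grapes consumed in it, that the sum of two events is set union, and that the proper part relation is proper inclusion. The grape labels and function names are hypothetical.

```python
from itertools import combinations

# Toy model (an assumption for illustration): an eating event is the set
# of grapes consumed in it; the sum of two events is the union of the sets.
GRAPES = ("g1", "g2", "g3", "g4")

def nonempty_subsets(items):
    """All non-empty subsets of `items`, as frozensets."""
    return [frozenset(c)
            for n in range(1, len(items) + 1)
            for c in combinations(items, n)]

EVENTS = nonempty_subsets(GRAPES)

def eating_grapes(event):
    return len(event) >= 1

def eating_two_grapes(event):
    return len(event) == 2

def is_cumulative(pred, events):
    """The sum of any two events in the extension is in the extension."""
    ext = [e for e in events if pred(e)]
    return all(pred(x | y) for x in ext for y in ext)

def is_quantized(pred, events):
    """No event in the extension has a proper part in the extension."""
    ext = [e for e in events if pred(e)]
    return not any(x < y for x in ext for y in ext)

print(is_cumulative(eating_grapes, EVENTS), is_quantized(eating_grapes, EVENTS))
print(is_cumulative(eating_two_grapes, EVENTS), is_quantized(eating_two_grapes, EVENTS))
```

In this model 'eating grapes' comes out as cumulative but not quantized, and 'eating two grapes' as quantized but not cumulative, matching the description in the text.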

Krifka assumes that the perfective operator presupposes the quantization of the entire predicate, whereas the imperfective operator requires its cumulativity. This correlation primarily (but not exclusively, cf. Krifka 1998) holds for predicates which allow mapping of objects to events and vice versa (the so-called homomorphism of objects to events). Certainly, it cannot be considered a 1:1 relationship (cf. Filip 1996; 2005 or Borik 2006). In Polish, the verbal predicate marked with the perfective aspect is quantized iff the whole verbal complex receives a telic interpretation, particularly in the case of predicates with nominal objects that are incremental themes. On the other hand, we get the combination of features [+perfective] and [–telic] after adding the delimitative prefix *po*- to the imperfective stem; in these cases the predicate is to be interpreted as atelic despite the perfective marking on the verb. Thus, the following generalization holds for Polish: telicity implies quantization, but perfectivity does not imply telicity (see also Gehrke 2008). As will become clear later, the truth inference of a sentential complement is triggered by perfectivity.

### **3 Cross-linguistic evidence for the interaction between perfectivity and factivity**

The influence of the perfective aspect of a matrix verb on the factive interpretation of complement clauses has already been observed. Hacquard (2006) shows that both actuality entailment and actuality implicature can be found in some modal constructions in French, when a modal is marked with the perfective.

Actuality entailment refers to the uncancelable inference stating that the proposition expressed by the complement clause holds in the actual world.<sup>2</sup> Consider (8), adapted from Hacquard (2006: 21).

(8) Jane #a pu / pouvait soulever cette table, mais elle ne l'a pas soulevée.
Jane aux could.pfv / could.ipfv lift this table but she neg it.aux neg lift
'Jane could lift this table, but she did not lift it.'

Example (9) demonstrates an actuality implicature (adapted from ibid. 16).

<sup>2</sup>Bhatt (1999) observed the correlation between perfectivity marked on ability modals and the presence of the actuality entailment in Greek and Hindi.

(9) Darcy a eu / avait la possibilité de rencontrer Lizzie.
Darcy aux had.pfv / had.ipfv the possibility to meet Lizzie
'Darcy had the possibility to meet Lizzie.'

When used with the perfective, (9) strongly suggests (but does not entail) that Darcy did meet Lizzie.

The correlation between perfectivity and factivity can also be seen in Hungarian; it concerns the influence of embedding verbs of saying on the interpretation of their sentential complements. Whereas *megmond* 'say.pfv' requires the argument to be true, *mond* 'say.ipfv' does not (see Kiefer 1986). Even though aspect is not grammaticalized in Hungarian (it is not obligatory for every verb to have its (im)perfective twin), informal investigations among speakers show that we can observe clear aspect-dependent differences with respect to the truthfulness of propositions embedded under verbs marked as perfective.

In the next section I am going to present Polish data showing a systematic interaction between perfectivity and truthfulness. In Polish, the category of aspect is fully grammaticalized, which allows us to take a closer look at the abovementioned dependency.

### **4 Aspect-dependent truth inferences in Polish**

### **4.1 Case 1: Truth presupposition**

One group of verbs where the truth presupposition of the perfective can be found is verbs of guessing.<sup>3</sup> From (10) it follows that the proposition from the embedded clause – Marek fears ghosts – is true. Example (11) demonstrates that this inference projects, i.e. it remains under negation and after the insertion of a modal adverbial.

(10) Jan zgadł / wyczuł, że Marek boi się duchów.
Jan guessed.pfv / sensed.pfv that Marek fears.ipfv refl ghost.pl
'Jan guessed that Marek fears ghosts.'
≫ Marek fears ghosts.

<sup>3</sup>The strength of the inference may also depend on aktionsart. For example, a resultative verb *wyczuć* 'sense.pfv' is factive, whereas the inchoative *poczuć* 'start feeling.pfv' is not (a similar observation holds for Czech, Radek Šimík, p.c.). It seems that inchoativity does not give rise to factivity, but to a weak truth implicature.

(11) Jan nie / prawdopodobnie zgadł / wyczuł, że Marek boi się duchów.
Jan neg / probably guessed.pfv / sensed.pfv that Marek fears.ipfv refl ghost.pl
'Jan did not guess / probably guessed that Marek fears ghosts.'
≫ Marek fears ghosts.

Contrary to this, no such inference appears with the respective imperfective counterparts. Example (12) shows that there is no entailment, let alone presupposition, that Marek fears ghosts when the subordinate clause is embedded under the imperfective variants of 'guess' / 'sense'.

(12) Jan zgadywał / wyczuwał, że Marek boi się duchów.
Jan guessed.ipfv / sensed.ipfv that Marek fears.ipfv refl ghost.pl
'Jan supposed that Marek fears ghosts.'
↛ Marek fears ghosts.

As expected, the truth inference is also absent under negation and after the insertion of a modal adverbial. Consider (13).

(13) Jan nie / prawdopodobnie zgadywał / wyczuwał, że Marek boi się duchów.
Jan neg / probably guessed.ipfv / sensed.ipfv that Marek fears.ipfv refl ghost.pl
'Jan did not suppose / probably supposed that Marek fears ghosts.'
↛ Marek fears ghosts.

Examples (12) and (13) leave it open whether it is true that Marek fears ghosts. Other members of this class are: *odkryć, odkrywać* 'discover', *rozgryźć, rozgryzać* 'figure out', and *rozpoznać, rozpoznawać* 'identify'.

### **4.2 Case 2: Truth entailment**

Many perfective matrix verbs show implicative behavior with respect to the truth inference of the proposition from the subordinate clause. For instance, verbs of proving seem to entail that their sentential argument is true, which can be seen in (14). However, *udowodnić* and *wykazać* are much stronger in their veridicality than *pokazać*.

(14) Jan udowodnił / wykazał / pokazał, że Marek boi się duchów.
Jan proved.pfv / revealed.pfv / showed.pfv that Marek fears.ipfv refl ghost.pl
'Jan proved / revealed / showed that Marek fears ghosts.'
→ Marek fears ghosts.

Interestingly, this inference is apparently cancelable in particular contexts. Consider (15) (cf. Anand & Hacquard 2014: 74).

(15) Jan udowodnił Basi, że Marek boi się duchów, jednak Krzysiek w to wątpi.
Jan proved.pfv Basia.dat that Marek fears.ipfv refl ghost.pl but Krzysiek in this doubts
'Jan proved to Basia that Marek fears ghosts, but Krzysiek doubts that.'
↛ Marek fears ghosts.

All the predicates in (14) allow an overt experiencer, which makes veridicality questionable. (15) says that Jan succeeded in convincing Basia that Marek fears ghosts, but he did not manage to convince Krzysiek. As a result, the lexical entry of the matrix predicate corresponds more to *convince* than to *prove*.

The 'weak entailment' from (14) does not project under negation or after the insertion of a modal adverbial, which can be seen in (16).

(16) Jan nie / prawdopodobnie udowodnił / wykazał / pokazał, że Marek boi się duchów.
Jan neg / probably proved.pfv / revealed.pfv / showed.pfv that Marek fears.ipfv refl ghost.pl
'Jan did not prove / reveal / show / probably proved / revealed / showed that Marek fears ghosts.'
↛ Marek fears ghosts.

Example (16) only says that Jan did not succeed / that Jan probably succeeded in providing arguments for Marek's fear of ghosts, but it leaves it open whether the complement sentence is true or not.

We have just seen that the weak truth entailment in the case of perfective verbs of proving can disappear in particular contexts, especially after an overt realization of an experiencer. Furthermore, the significance or trustworthiness of the authority also plays a role in acknowledging a complement proposition as veridical. No projection pattern can be observed, which means that we are not dealing with a presupposition here.

The respective imperfective forms lack any kind of truth-contributing potential. Consider example (17) for affirmative sentences.

(17) Jan udowadniał / wykazywał / pokazywał, że Marek boi się duchów.
Jan proved.ipfv / revealed.ipfv / showed.ipfv that Marek fears.ipfv refl ghost.pl
'Jan was proving / revealing / showing that Marek fears ghosts.'
↛ Marek fears ghosts.

Example (17) asserts that Jan was trying to prove / reveal / show that Marek fears ghosts, but it does not make any statement about the final results of Jan's investigations. As expected, no truth inference can be found under negation or after the addition of a modal adverbial, which can be seen in (18).

(18) Jan nie / prawdopodobnie udowadniał / wykazywał / pokazywał, że Marek boi się duchów.
Jan neg / probably proved.ipfv / revealed.ipfv / showed.ipfv that Marek fears.ipfv refl ghost.pl
'Jan was not / probably proving / revealing / showing that Marek fears ghosts.'
↛ Marek fears ghosts.

Example (18) demonstrates possible modifications of the likelihood of Jan having tried to prove / reveal / show that Marek fears ghosts. No contribution to the truth-related meaning of the complement sentence can be observed. Another member of this group is for instance *przekonać*, *przekonywać* 'convince'.

### **4.3 Case 3: Truth implicature**

Truth implicature refers especially to the perfective communication verbs, which differ from their imperfective counterparts in that the former, but not the latter, entail the complete realization of all parts of the speech act. Austin (1962) defines a speech act as consisting of three partial acts. The first one, a locutionary act, is the act of uttering itself. The second one, an illocutionary act, affects the area of the speaker's intention. Finally, a perlocutionary act describes an actual effect the particular speech act had on the hearer. A speech act is presumed to be completely realized only if all three parts have been fulfilled. In Polish, perfective communication verbs, in contrast to imperfective ones, enforce complete fulfillment of all parts of the speech act, as example (19) illustrates.<sup>4</sup>

(19) Iza właśnie go o tym #poinformowała / informowała, ale przerwał jej w pół słowa.
Iza just him about that informed.pfv / informed.ipfv but interrupted her in middle word
'Iza has just informed / was just informing him about that, but he interrupted her in the middle of the sentence.'

Only *poinformowała* entails that the hearer received the information.

### **5 Perfectivity-dependent truthfulness**

First of all, a short note on telicity should be made. My object of investigation is embedding predicates, which are transitive verbs. They all require a direct object, realized as a sentential complement; for the purpose of my analysis, I consider the that-clause a definite argument. For this reason, the whole complex predicate receives a telic interpretation, independently of the (im)perfective marking on the verb. The truth inference is present when the matrix predicate has the features [+telic, +perfective], and it is absent when the matrix predicate has the features [+telic, –perfective].

Based on the influence of aspect on the interpretation of nominal arguments, I also assume a dependency between aspect and a propositional argument. The aspectual operator PFV introduces a further undefined truthfulness feature, which is specified as factive, veridical or reliable via the dependency between the truth of *p* (where *p* stands for the proposition expressed by the that-clause) and an event *e* described by the matrix verb. For now, the three truthfulness-realizations can be formalized as follows:<sup>5</sup>

	- a. PFV(*λe*.⟦VP⟧(*e*) such that the truth of *p* is independent of *e*) → *p* is factive

<sup>4</sup> I would like to thank Manfred Krifka for inspiring this idea.

<sup>5</sup>The operations are based on the semantics of the perfective and not on the formation patterns. In future work, the morphology will be integrated into the semantic account (cf. Młynarczyk 2004).


Truth presupposition comes about when the truth of *p* is independent of the truth of *e*. Here, no incremental creation of belief can be observed. For example, the truth of propositions embedded under *zgadnąć* or *przewidzieć* holds independently of the process of guessing or predicting. In contrast, the truth of propositions embedded under *udowodnić*, *wykazać* or *pokazać* does depend on the result of the proving-process; we have an incremental creation of belief. This explains why the authority of an experiencer or its overt realization are crucial for judging complement sentences as veridical. In the case of truth implicature, the truth of *p* is 'only' communicated by *e*.

The question remains whether 'being reliable' should be considered a feature at all, or if it should be labeled as 'no feature present'. In the latter case, truthfulness set up by the perfective operator would remain unrealized if the inference was an implicature. Another open question concerns the role of morphology in determining the strength of the inference. It seems that perfective underlying forms tend to enforce factive meaning of the proposition expressed by the subordinate clause. Additionally, verb semantics and argument structure may also be taken into consideration, since specifying an experiencer can influence the entailment pattern. In general, the semantic type of the matrix verb could be used to distinguish between different verb classes and to establish a more fine-grained truthfulness scale. All this will be the subject of further investigations.

In the last section of this paper I will briefly discuss the inherently factive imperfectives and their perfective counterparts. It will be shown that they constitute a unique group with factivity being an aspect-independent, lexical property of the root form, which automatically projects to the perfective derivate.

### **6 Remark on inherently factive imperfectives**

As has been mentioned before, inherently factive imperfectives (for example emotive factives) require their complements to be true. Consider example (21).

(21) Ania cieszyła / ucieszyła się, że idzie lato.

Ania was.happy.ipfv / was.happy.pfv refl that comes.ipfv summer

'Ania was happy / started being happy about the fact that the summer was coming.'

≫ The summer was coming.

### 21 How factive is the perfective?

The only difference between *cieszyła* and *ucieszyła* lies in the marking of the beginning of a state in the case of the latter. The underlying imperfective form is inherently factive (lexical factivity), so it remains factive when perfectivized. In the case of inherently factive imperfectives the perfectivizing operation leads to the marking of a temporal boundary of an event, but it does not enforce or change the truth inference of the proposition from the embedded clause (see also §1). This pattern needs to be distinguished from the ones discussed in §4 and §5, where the truth inference ascribed to the perfective was absent in the particular imperfective forms. Other inherently factive imperfectives are *rozumieć* 'understand' and *kapować* 'get'.

### **7 Conclusion**

In this paper, I demonstrated three kinds of perfectivity-dependent truth inferences in Polish: truth presupposition, truth entailment and truth implicature. In the case of truth presupposition, the proposition from the embedded clause receives a factive interpretation. The inference remains under negation or after the insertion of a modal adverbial. In the case of truth entailment, a veridical interpretation of the complement sentence can be observed; only the positive sentence is interpreted as true. In the case of truth implicature, the inference in question is neither factivity nor veridicality. It is due to a pragmatic principle giving preference to the perfective verb if the speaker assumes that the sentence subject is reliable (speaker commitment to the truth of *p*).
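The three patterns just summarized can be set side by side schematically. The following is only a summary sketch of the diagnostics described above, not a formalization taken from the analysis itself: the labels S, Neg(S) and Mod(S) and the use of the two arrows are assumptions introduced here for illustration.

```latex
% Requires amsmath and amssymb.
% Assumed notation (not from the original text):
%   S       affirmative sentence with a perfective matrix verb embedding p
%   Neg(S)  the negated counterpart of S
%   Mod(S)  S with a modal adverbial inserted
%   A \Rightarrow p        "A licenses the inference that p is true"
%   A \rightsquigarrow p   "A pragmatically communicates that p is true"
\begin{align*}
\text{truth presupposition (factive):} \quad
  & S \Rightarrow p, \quad \mathrm{Neg}(S) \Rightarrow p, \quad \mathrm{Mod}(S) \Rightarrow p \\
\text{truth entailment (veridical):} \quad
  & S \Rightarrow p, \quad \mathrm{Neg}(S) \not\Rightarrow p, \quad \mathrm{Mod}(S) \not\Rightarrow p \\
\text{truth implicature (reliable):} \quad
  & S \rightsquigarrow p \quad \text{(speaker commitment; neither factive nor veridical)}
\end{align*}
```

Read row by row, the sketch makes the scale explicit: the presupposition survives both embedding tests, the entailment holds only for the positive sentence, and the implicature rests on the speaker's assumption that the sentence subject is reliable.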

Despite the differences in the strength of particular inferences, the truthfulness of the proposition from the embedded clause is only due to perfectivity – it is absent with imperfective forms. Embedding by imperfective matrix verbs results in the occurrence of a neutral interpretation of a that-clause with respect to its truthfulness, provided that the embedding imperfective verb is not inherently factive. The aspectual operator PFV introduces a truthfulness feature, which is realized as factive, veridical or reliable depending on the relation between the truth of the proposition expressed by the embedded clause and an event described by the matrix verb.

The question remains as to how truthfulness interacts with perfectivity itself. In the case of communication verbs, the completedness condition of the perfective enforces the complete performance of the speech act denoted by the matrix verb. The speaker of the sentence chooses the perfective if she considers the speaker of the speech act reliable. As a result, the proposition expressed by the that-clause is understood to be true. In the case of verbs of proving, the completedness effect of the perfective interacts with the incrementality, which is a part of lexical verb semantics. A proof is a proof after its final step is completed. For verbs of guessing, the truth presupposition is triggered in combination with the integration of the proposition 'someone guessed something' into the common ground. The speaker uses the perfective in order to demonstrate that the guessing event has been completely realized. The cooperative hearer accepts the proposition as true, which triggers the presupposition rooted in the lexical verb semantics.<sup>6</sup>

In future work, a detailed study with different semantic groups of verbs will be conducted. In addition, the type of embedding will be controlled for, since it may be involved in determining the strength of the inference available. An interesting observation concerns perfective verbs of saying which embed wh-phrases; they seem to function as exhaustivity triggers. Thus, exhaustivity could also be used to make the truthfulness scale more fine-grained.

### **Abbreviations**


### **Acknowledgements**

I would like to thank Luka Szucsich, Manfred Krifka and Radek Šimík. I thank Berit Gehrke, Kyle Johnson, Dara Jokilehto, Denisa Lenertová, Clemens Mayr, Roland Meyer, Brandon Waldon, Ilse Zimmermann and audiences from ZAS, HU Berlin, ELTE and MTA. I am also grateful to the anonymous reviewers for their helpful comments. Finally, I thank Kirsten Brock and Jake Walsh for correcting my English. All remaining errors are my own.

This work was supported by the German Bundesministerium für Bildung und Forschung (BMBF) (Grant No. 01UG1411).

### **References**

Anand, Pranav & Valentine Hacquard. 2014. Factivity, belief and discourse. In Luka Crnič & Uli Sauerland (eds.), *The art and craft of semantics: A festschrift for Irene Heim*, vol. 1, 69–90. Cambridge: MIT Working Papers in Linguistics.

<sup>6</sup> I would like to thank one of the anonymous reviewers for inspiring this idea.





accusative, 27, 27<sup>2</sup> , 35, 37, 38, 43, 45, 157, 208, 244, 247, 248, 348<sup>12</sup> , 358, 359, 362, 365, 371<sup>7</sup> , 373, 375, 435, 442, 443 adjectival, 55, 56, 58–61, 65, 71–73, 78, 78<sup>1</sup> , 84, 84<sup>9</sup> , 85–89, 97–99, 105, 174, 185, 186, 323, 330, 332, 428, 430, 441 adjective, 87<sup>12</sup> , 92, 96, 98, 151, 176, 186, 242, 315, 318, 329, 332, 340<sup>3</sup> , 341, 342, 342<sup>6</sup> , 343, 345, 345<sup>9</sup> , 346, 347, 403<sup>3</sup> , 409<sup>11</sup> , 413, 413<sup>16</sup> , 414, 430 adverb, 27, 32–35, 37<sup>10</sup> , 39, 44, 47, 47<sup>13</sup> , 100, 197–199, 201, 210, 211, 246, 254 adverbial, 61, 78<sup>1</sup> , 84, 85, 89, 98, 254, 457, 458, 460, 472, 474, 475 affricate, 272, 273, 280 agreement, v, 148, 163, 170, 170<sup>1</sup> , 171, 171<sup>5</sup> , 172–174, 174<sup>9</sup> , 175, 175<sup>10</sup> ,176–179,181–188,199<sup>9</sup> , 225, 227, 230, 231, 313, 314, 315<sup>1</sup> , 318, 319, 322–324, 326<sup>6</sup> , 327–330, 332, 340<sup>3</sup> , 345, 345<sup>9</sup> , 382, 405, 412, 413, 413<sup>14</sup> , 414, 416, 416<sup>19</sup> , 419, 421<sup>22</sup> , 422, 461 anaphora, 64, 69, 70 animate, 38, 170–174, 174<sup>9</sup> , 175, 175<sup>10</sup> , 176<sup>11</sup> , 179, 180, 182, 183, 188,

392, 394, 421 applicative, 26, 32<sup>7</sup> , 44, 45, 47–49 archiphoneme, 117–120 aspect, 55, 62, 72, 129, 132, 134, 143, 197–201, 211, 219, 229, 236, 289–291, 294, 298, 303–305, 307, 308, 480, 482, 489, 491 case assignment, 230, 358, 359, 363– 365, 369, 375 case feature, 358, 359, 362, 413<sup>16</sup> clitic, 9, 11, 12<sup>17</sup> , 151, 161, 200, 203, 204, 207, 231, 244, 245, 245<sup>2</sup> , 246–249, 252– 254, 258<sup>6</sup> , 375, 384, 385, 389, 394 clitic cluster, 149, 196, 207–209, 210<sup>14</sup> , 212 clitic placement, 151, 154, 155, 164, 207, 243, 248, 249 cliticization, 243–249, 258<sup>6</sup> comparative, 88, 90, 92, 96, 103, 290 complement clause, 222, 224, 234– 236, 456, 457, 464, 484 complementary distribution, 2, 9, 13, 86, 235, 371 complementizer, 2–4, 4 2 , 5, 5 5 , 5 6 , 9 11 , 10, 11, 12<sup>17</sup> , 12<sup>18</sup> , 14, 15, 18, 18<sup>26</sup> , 20, 96, 156, 197, 197<sup>7</sup> , 201, 207, 208, 222, 223, 232, 235, 247, 257

conjunct, 98, 171<sup>5</sup> , 173<sup>8</sup> , 174, 176, 179, 183, 185, 186, 233, 319, 329, 332, 469 conjunction phrase, 176–186 consonant, 111, 112, 114, 120, 285<sup>14</sup> contrastive, 361, 414, 416, 422, 423 contrastive focus, 162, 198, 202, 203<sup>13</sup> , 233 dative, 25, 27, 27<sup>2</sup> , 29, 35, 37– 39, 41–49, 152<sup>2</sup> , 208, 223, 231, 247, 359, 360, 363–366, 368<sup>5</sup> , 369, 369<sup>6</sup> , 370, 372– 375, 433, 435, 438, 442 definite article, 342<sup>6</sup> , 343, 345, 346 dialect continuum, 148, 149, 151, 161, 164 direct object, 26, 28, 32<sup>7</sup> , 38, 44–46, 49, 241, 348, 360, 382, 389, 392, 394, 483, 489 *see also* DO discourse particle, 193, 197<sup>7</sup> , 199, 202 ditransitive, 26, 27, 29–32, 32<sup>7</sup> , 35, 37–39, 41–43, 47, 348, 350<sup>13</sup> , 369, 370, 373, 375 DO, 26, 348, 351, 353, 360–362, 365, 372, 375 *see also* direct object double object, 27, 42<sup>12</sup> , 350, 360 double object construction, 25, 26, 27<sup>2</sup> , 28<sup>4</sup> , 30, 32, 32<sup>7</sup> , 40, 47, 48, 375 echo interpretation, 10, 13–16, 20 ellipsis, 4 2 , 194, 196, 210<sup>15</sup> , 315, 316, 318, 325, 327, 328, 331, 332, 366 entailment, 329, 480, 481, 484, 484<sup>2</sup> ,

486, 487, 490

equative, 95, 96, 104 ergative, 358, 359, 362, 363, 365, 370, 374 experiencer, 150, 155, 162, 164, 487, 490 external argument, 159, 231, 364, 372, 373 factivity, 73, 479–482, 484, 485, 485<sup>3</sup> , 489–491 feature geometry, 178, 180, 219, 226, 227 feature resolution, 170, 171<sup>5</sup> , 172, 176, 177, 181, 186, 188 feminine, 151, 170, 171, 171<sup>3</sup> , 174, 175, 180, 182, 183, 314–316, 318, 319, 322–324, 326, 326<sup>6</sup> , 327, 329, 330, 332, 403<sup>3</sup> , 421<sup>22</sup> , 429, 431–433, 433<sup>1</sup> , 435, 437–439, 441–443 finite complementizer, 2, 5, 6, 16, 19, 20 functional load, 269, 273 future, 37<sup>10</sup> , 66, 71, 102, 129, 130, 132, 134, 135, 135<sup>5</sup> , 136, 138–140, 142, 160, 161, 188, 219, 222, 224, 235, 236, 258, 329, 338, 350<sup>13</sup> , 367<sup>4</sup> , 368<sup>5</sup> , 375, 404<sup>4</sup> , 459, 462, 464, 468<sup>9</sup> , 473, 482, 489<sup>5</sup> , 492 gender, 170–177, 180–185, 314, 315, 315<sup>1</sup> , 316–318, 323, 325–327, 329–332, 345<sup>9</sup> , 388, 421<sup>22</sup> , 433, 436, 436<sup>5</sup> , 437 genitive, 359, 367, 367<sup>4</sup> , 401, 402, 402<sup>1</sup> , 403, 403<sup>3</sup> , 404, 404<sup>4</sup> , 405–408, 411–414, 418, 419, 431, 433<sup>1</sup> , 437–443

grammatical gender, 314, 315, 318, 324, 326, 327, 330–332 head directionality, 241–243, 246, 249, 258<sup>6</sup> , 259 heritage language, 266–268, 268<sup>3</sup> , 269, 277, 283, 284 imperfective, 54–64, 64<sup>8</sup> , 65–73, 137, 219, 289, 290, 292, 292<sup>2</sup> , 293–296, 296<sup>6</sup> , 296<sup>7</sup> , 297– 300, 300<sup>10</sup> , 303, 304, 306, 306<sup>15</sup> , 307, 480–484, 486, 488, 489, 491 implicature, 94, 95, 298, 481, 482, 484, 488, 490 inanimate, 170, 171, 171<sup>3</sup> , 173, 175, 179, 180, 182, 297, 392, 394, 420, 421, 431 indirect object, 26–28, 32<sup>7</sup> , 348, 360 *see also* IO infinitive, 130, 154, 156, 157, 159–161, 218, 223, 229–231 inflectional class, 429, 439, 440 information structure, 20, 63, 64, 67, 71 instrumental, 61, 88<sup>14</sup> , 433, 437, 438, 440, 441 internal argument, 87, 352, 353, 373 interrogative, 2, 4–6, 9–11, 14, 14<sup>21</sup> , 15–17, 17<sup>24</sup> , 18, 197 IO, 348, 351, 353, 360–362 *see also* indirect object language change, 257<sup>5</sup> , 267, 268, 284, 286 LBE, 338, 340, 340<sup>3</sup> , 340<sup>4</sup> , 341–343, 345–348, 350, 350<sup>13</sup> , 351– 353, 368<sup>5</sup>

left branch extraction, *see* LBE left periphery, 46, 194–196, 201, 206, 210, 210<sup>14</sup> , 212 lexical decomposition, 26, 30, 33, 42 lexicalization, 2–5, 7, 7 8 locative, 29, 40, 44–47, 49, 431, 433, 435, 437, 438, 440, 442, 443 main clause, 2, 7, 10, 10<sup>15</sup> , 11, 13, 14, 14<sup>23</sup> , 15, 456, 460 masculine, 151, 170, 171, 171<sup>3</sup> , 172–174, 174<sup>9</sup> , 175, 179, 180, 182, 183, 314–316, 318, 319, 322–324, 326, 328–332, 421<sup>22</sup> , 429, 431–433, 433<sup>1</sup> , 435, 437–443 matrix clause, 156<sup>3</sup> , 218, 223, 224, 231, 234, 235, 462, 471–474 matrix verb, 157, 159, 235, 236, 482– 484, 489–491 modal adverbial, 480, 485–488, 491 mutual entailment, 324, 325, 327, 329, 331, 332 narrow syntax, 169, 170, 177, 178, 181, 186, 362 natural gender, 314, 318, 324, 326– 328, 330–332 negation, 63, 66, 67, 69, 69<sup>11</sup> , 110<sup>1</sup> , 207, 210<sup>15</sup> , 243, 250, 255– 258, 258<sup>6</sup> , 259, 480, 485– 488, 491 neuter, 151, 170–173, 175, 176<sup>11</sup> , 180– 182 nominal, 69, 78, 84, 84<sup>9</sup> , 85, 87–89, 98, 171, 178, 241, 242, 315, 327, 330, 427, 428, 430, 432, 439, 482–484, 489 nominalization, 65, 70

nominative, 111<sup>4</sup> , 174, 185, 223, 227, 230, 231, 358, 359, 365, 370, 371, 403, 403<sup>3</sup> , 405, 412, 413, 416, 419, 428, 430, 431, 433, 433<sup>1</sup> , 435, 437, 439, 443 null object, 381, 383<sup>1</sup> , 385, 386, 392– 394 numeral, 78, 83, 91, 95, 102, 402, 402<sup>1</sup> , 403, 403<sup>2</sup> , 403<sup>3</sup> , 404, 404<sup>4</sup> , 405, 405<sup>4</sup> , 406, 407, 407<sup>9</sup> , 408, 408<sup>10</sup> , 412, 413, 413<sup>16</sup> , 414–419, 421<sup>22</sup> , 422 object drop, 381–383, 388, 392–395 object omission, 382, 383, 390–395 oblique case, 428, 430–432, 437, 439, 440, 443 participle, 53, 56, 57, 59, 60, 67, 70, 71, 174, 219, 220, 250–258, 258<sup>6</sup> , 315 partitive, 93, 103, 104, 407, 417, 417<sup>21</sup> , 418, 419, 422, 423, 483 past, 53, 55–58, 62, 64, 66, 68<sup>10</sup> , 71–73, 137–140, 159, 160, 174, 218–220, 223–231, 235, 236, 251, 253, 258, 260, 292, 293, 456–459, 461–466, 468, 468<sup>9</sup> , 469, 470, 472–475 perfective, 54, 54<sup>1</sup> , 55, 56, 60, 62, 63, 64<sup>8</sup> , 65, 66, 69, 70, 72, 73, 129, 132, 133, 135–137, 142, 219, 221, 228<sup>3</sup> , 258, 289–291, 293–299, 300<sup>10</sup> , 301, 302, 304–306, 306<sup>15</sup> , 307, 308, 392, 480–489, 489<sup>5</sup> , 490– 492 performative speech act, 128, 129, 131, 133–136, 139, 143

plural, 170, 173, 175, 176<sup>11</sup>, 177, 179–182, 259, 401–403, 403<sup>3</sup>, 405, 406, 408, 411–414, 417–419, 422, 431–433, 433<sup>1</sup>, 439–444
polar operator, 3, 7, 16, 19
prenominal modifier, 151, 161
present, v, vi, 129, 130, 131<sup>3</sup>, 132, 133, 135–143, 149, 154, 156, 156<sup>3</sup>, 157–159, 159<sup>5</sup>, 160–165, 218, 219, 222, 224, 235, 236, 253, 255, 260, 305, 457<sup>5</sup>, 458, 464, 470, 473–475
presupposition, 30, 64, 136, 198<sup>8</sup>, 325–328, 332, 458, 462–466, 468, 468<sup>9</sup>, 469, 474, 475, 480–482, 485, 486, 488, 490–492
pretonic, 111, 111<sup>3</sup>, 112, 113, 115, 116, 121, 122
processing, 350<sup>13</sup>, 427–433, 439–444
pronominal, 70, 159, 210<sup>14</sup>, 243, 244, 245<sup>2</sup>, 246, 246<sup>3</sup>, 247–250, 254, 385, 464, 472
pronoun, 60, 63, 69, 96, 178, 315<sup>1</sup>, 375, 384, 389, 394
recoverability, 410, 422, 423
repetitive, 27, 27<sup>3</sup>, 30, 30<sup>5</sup>, 31, 32, 32<sup>7</sup>, 33, 35<sup>9</sup>, 36, 37, 37<sup>10</sup>, 39, 44, 47, 47<sup>13</sup>
restitutive, 27, 27<sup>2</sup>, 28, 28<sup>4</sup>, 29–32, 32<sup>7</sup>, 33, 33<sup>8</sup>, 34, 35, 35<sup>9</sup>, 36, 37<sup>10</sup>, 38–44, 47, 47<sup>13</sup>, 48, 49
second position, 11<sup>17</sup>, 201, 207, 232, 235, 243–245, 245<sup>2</sup>, 246–248, 250, 256, 258<sup>6</sup>
sibilant, 274, 285, 286

sluice, 204, 205, 210
small clause, 25–27, 28<sup>4</sup>, 29–32, 32<sup>7</sup>, 34, 39–41, 45, 47, 48
soft consonant, 112–114, 116, 117, 119
specifier, 2, 4, 5<sup>6</sup>, 10, 12<sup>17</sup>, 12<sup>18</sup>, 14–17, 20, 28<sup>4</sup>, 32<sup>7</sup>, 231, 245<sup>2</sup>
speech act, 128–131, 131<sup>4</sup>, 132–136, 142, 143, 482, 488, 489, 491
speech act verb, 128–132, 134, 135, 139, 141, 143
spell-out domain, 338, 350, 350<sup>13</sup>, 352, 353
stative, 27, 29–31, 34, 35, 37, 37<sup>10</sup>, 39, 41–48, 61<sup>6</sup>, 101, 102, 227, 228<sup>3</sup>, 229
structural case, 404, 405
subjunctive, 217, 219–226, 229, 231, 232, 235–237
subordinate clause, 223, 480, 480<sup>1</sup>, 486, 490
syncretism, 170, 171<sup>4</sup>, 176, 187, 188, 403<sup>3</sup>, 431, 433, 435, 439
syntax-semantics interface, 26, 170, 177, 181, 185, 186
topicalization, 245, 254, 254<sup>4</sup>, 255, 406, 407, 407<sup>8</sup>, 418, 419, 422, 423
transitive verb, 150, 155, 162, 164
unaccusative, 373, 375, 385
verbal, 37, 39, 55, 56, 59–61, 64, 65, 70–73, 120, 128, 130<sup>2</sup>, 134, 142, 143, 149, 176, 227, 242–244, 248, 256, 292, 482–484
vowel inventory, 111–114, 116, 117
vowel reduction, 109–113, 115, 117, 120–123

word order, v, 27<sup>2</sup>, 35, 47<sup>13</sup>, 56, 67, 68, 71, 72, 231, 242, 252, 253, 360, 361, 361<sup>2</sup>, 362, 368<sup>5</sup>, 369, 373, 374, 407<sup>8</sup>

# Did you like this book?

## This book was brought to you for free

Please help us in providing free access to linguistic research worldwide. Visit http://www.langsci-press.org/donate to provide financial support or register as a community proofreader or typesetter at http://www.langsci-press.org/register.

## Advances in formal Slavic linguistics 2016

*Advances in Formal Slavic Linguistics 2016* initiates a new series of collective volumes on formal Slavic linguistics. It presents a selection of high-quality papers authored by young and senior linguists from around the world and contains both empirically oriented work, underpinned by up-to-date experimental methods, and more theoretically grounded contributions. The volume covers all major linguistic areas, including morphosyntax, semantics, pragmatics, phonology, and their mutual interfaces. The particular topics discussed include argument structure, word order, case, agreement, tense, aspect, the clausal left periphery, and segmental phonology. The topical breadth and analytical depth of the contributions reflect the vitality of the field of formal Slavic linguistics and prove its relevance to the global linguistic endeavour. Early versions of the papers included in this volume were presented at the conference on Formal Description of Slavic Languages 12 or at the satellite Workshop on Formal and Experimental Semantics and Pragmatics, which were held on December 7–10, 2016, in Berlin.